Chapter 1
Introduction 

Image  edges  can  be  defined  as  local  changes  or 
discontinuities  in  an  image  attribute  such  as  luminance, 
tristimulus  value,  or  texture  [1].  These  changes  are 
important  in  the  analysis  of  images  because  they  often 
provide  an  indication  of  the  physical  extent  of  objects 
within the image. An operator used to detect these changes
is  called  an  edge  detector.  This  operator  transforms  an 
image  into  a binary  array  containing  ones  where  the 
magnitude  of  the  discontinuity  is  significant  and  zeros 
elsewhere.  The  binary  array  obtained  is  usually  called  an 
edge  map.  This  transformation  is  useful  in  image 
understanding  systems,  because  while  the  edge  map  retains 
much  of  the  basic  structure  of  the  image,  less 
computational  effort  is  required  for  analysis  as  compared 
to  the  original  image. 

1.1  Edge  Detection  Techniques 

There  are  many  techniques  which  can  be  used  in  edge 
detection.  These  include  simple  differential  operators, 
template matching, least square edge fitting, and
techniques based on statistical detection theory. There
are  also  many  heuristic  methods  developed  for  edge 
detection.  A complete  survey  of  all  edge  detectors  is  not 
a simple  task,  and  can  even  be  confusing.  Hence,  only  a 
group  of  the  most  useful  operators  will  be  discussed  in  the 
following  sections. 

Linear differential operators are commonly employed in
edge detection. In this method, edges are enhanced by
convolving the image with a set of discrete differential
operator masks. A corresponding edge map is obtained by
thresholding some function of the outputs of these masks.
One of the differential operators used is the gradient.
The gradient is approximated by convolving the image with
two masks that measure the change in pixel luminance along
two orthogonal directions. The sum of the squares of the
mask outputs is a measure of the squared gradient
magnitude. Roberts used 2x2 masks to compute the luminance
difference across the diagonals [2], while Prewitt [3] and
Sobel [4] used 3x3 masks to measure the difference in the
horizontal and vertical directions. Another differential
operator that has been used in edge enhancement is the
Laplacian operator. Examples of Laplacian masks are given
in [1, 3]. However, since the Laplacian operator is more
sensitive to points and lines than to edges [5], it is not
an efficient method for edge detection. In general, all of
the linear differential operators have the advantage of
using simple mathematical formulas which require short
computation time. Their major disadvantage is their
sensitivity to noise. One method to improve the
performance of differential operators in the presence of
noise is to increase the mask size. This can be seen by
comparing the performances of the Roberts and the Sobel
operators. Another, and rather better, method is to design
edge detectors taking into consideration the effect of
noise. This leads to using template matching in edge
detection.

The problem of edge detection can be reformulated as
follows [1]: given a subregion of the image, find one
member of a finite group of templates representing edges
and no edges, such that this member matches the subregion
as closely as possible, and label the subregion accordingly.
Matching is usually measured in terms of the mean square
difference between the subregion and the templates.
Calculation can be simplified by expanding the mean square
difference and neglecting the slowly varying terms. The
remaining term is the cross correlation between the
subregion and the templates. This term should be maximum
for the best match. Cross-correlation template matching
searching sequentially at each point for the best match.
In this method, gradient magnitude is equated with the
maximum response, and direction is taken parallel to the
orientation of the corresponding detector [3]. The
templates correspond to horizontal, vertical and diagonal
edges. Other forms of templates were later introduced by
Kirsch [6] and Robinson [7]. The basic advantage of these
operators is that they can be implemented with a
relatively small computational effort. In addition, proper
choice of the template coefficients gives almost optimum
performance. However, optimum performance can never be
achieved since the number of templates used is always
finite. A different approach to achieve optimum
performance was later introduced by Hueckel.
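
The expansion of the mean square difference mentioned above can be written out in one line. The notation here is assumed for illustration and is not taken from the cited references: $f_j$ denotes the pixels of the image subregion and $t_{\ell,j}$ the elements of the $\ell$'th template.

```latex
\sum_j \left( f_j - t_{\ell,j} \right)^2
  = \sum_j f_j^2 \;-\; 2 \sum_j f_j \, t_{\ell,j} \;+\; \sum_j t_{\ell,j}^2
```

The first sum does not depend on $\ell$. If, as a further assumption, all templates have the same energy $\sum_j t_{\ell,j}^2$, the last sum is also the same for every template, so choosing the template with the smallest mean square difference is equivalent to choosing the one with the largest cross correlation $\sum_j f_j \, t_{\ell,j}$.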

In Hueckel's algorithm [8], edges are detected by
fitting circular subregions of the image to ideal edge
models. If the fit is sufficiently accurate, an edge is
assumed to exist with the same parameters as the ideal edge
model. The edge model used is a two-dimensional step in a
circular disc. The parameters of this model are the
luminance levels, the edge orientation and distance from
the center. The accuracy of edge fitting is measured in
terms of the mean square error criterion. Hueckel
introduced a polar Fourier expansion and used the first
eight coefficients in the minimization procedure. Although
this approximation simplifies the computation needed, it
affects the accuracy of the minimization.
Hueckel has not provided any evaluation of this problem.

Another  method  to  achieve  optimum  edge  detection  is  to 
introduce  statistical  detection  theory  concepts.  In  the 
statistical  model,  images  are  considered  to  be  the  sum  of 
two  components;  the  first  is  an  ideal  image  in  which  edges 
of  different  orientations  and  heights  are  distributed, 
while  the  second  consists  of  a random  additive  noise.  For 
this  model,  edge  detectors  are  designed  to  achieve  an 
optimum  probability  of  correct  decisions.  Griffith  has 
used  this  approach  in  the  analysis  of  scenes  consisting  of 
prismatic  solids.  He  introduced  a detailed  study  of  the 
distortion  and  noise  affecting  the  image,  and  implemented  a 
decision  procedure  based  on  computing  the  probability  that 
a line  representing  a real  edge  is  centered  in  and 
traverses some long narrow band. However, the computation of
this probability was a difficult task, and the final
results were based on many unjustified approximations [9].
A different approach to statistical edge detection was
proposed by Yakimovsky [10]. In this approach, two
adjacent regions of the image are tested; first assuming
that they have the same average luminance, and then
assuming that they have two different luminance levels.
Maximum likelihood estimates in both cases are compared,
and an edge is indicated if it is more likely that the
regions have two different luminance levels. A
disadvantage of the Griffith and Yakimovsky algorithms is
that  they  are  designed  to  detect  edges  of  a certain 
orientation.  They  are  less  sensitive  to  edges  with  other 
orientations.  To  avoid  this  problem,  the  operator  is 
usually  applied  with  enough  orientations  to  give  uniform 
response.  The  different  results  are  then  combined  to  form 
the  edge  map. 

A completely different approach to edge detection is
to use a priori knowledge of the image objects in
searching for their boundaries. Examples can be found in
the work of Kelly [11] and Chow [12]. Kelly introduced a
program for extracting an accurate outline of a man's head
from a digital picture [11]. His method consisted of three
steps. First, a new digital picture was prepared from the
original; the new picture is smaller and has less detail.
Then edges of objects are located in the reduced picture.
Finally, the edges found in the reduced picture are used as
a plan for finding edges in the original picture. Chow
studied the problem of detecting the boundary of the human
heart in a cineangiogram [12]. He assumed that the
probability distribution of any small region of the picture
that contains only object or only background is unimodal,
and a region that contains both object and background will
be a mixture of the two distributions. The unimodal
distributions are assumed to be Gaussian. Starting from
these assumptions, Chow's algorithm examines the
probability distribution of the image subregions. If the
standard deviation is large, the probability distribution
is fitted to a bimodal Gaussian. The bimodality is
measured by computing the valley-to-peak ratio. If this
ratio is high, the points in the subregion are classified
as a part of the object or the background depending on
their intensity. Although the Chow algorithm is successful
in determining the boundary in single-object scenes, it is
not directly extendable to scenes with many objects. This
latter case is more important in scene analysis. Because
the previous operators are limited in their applications,
they will not be considered further in this dissertation.

1.2  Edge  Detector  Evaluation 

Another field of study in edge detection, which has
not been given enough consideration, is the performance
evaluation of edge detectors. As stated in reference [1],
this evaluation is difficult because of the large number of
proposed methods, the difficulties in determining the best
parameters associated with each technique, and the lack of
definite performance criteria. One method for edge
detection evaluation was suggested by Fram and Deutsch
[13]. In this method, a test image in the form of an ideal
ramped edge with additive Gaussian noise is used to
evaluate the performance of edge detectors suggested by
Hueckel, Macleod, and Rosenfeld. Two parameters are used
in this evaluation. The first is the maximum likelihood
estimate of the ratio between the number of correct
detections of edges and the total number of detected edges.
The practical significance of the second parameter is not
clear. The results are compared with human ability to
perceive edges. In this experiment, the results obtained
with the Hueckel operator appear to be inferior. This can
be partially explained by the fact that the Hueckel
internal parameters used are far from the optimum choice.
Another method for measuring the performance of edge
detectors was given by Pratt [1]. This method uses a
figure of merit which is sensitive to the different kinds
of errors encountered in edge detection: missing or
displacing a true edge and the false detection of noise.
The figure of merit introduced has been used to measure the
optimum performance of the Roberts, Sobel, Kirsch, and
compass gradient operators in the case of an artificial
image of a vertical edge with additive Gaussian white
noise. The experiment shows that the Kirsch and the Sobel
operators have relatively high figures of merit, followed by
the compass gradient operator and finally the Roberts
operator. These results agree with the visual data.
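
The figure of merit itself is not written out in this chapter. As a hedged illustration, the sketch below uses the form in which Pratt's measure is commonly quoted, F = (1 / max(I_I, I_A)) * sum over detected points of 1 / (1 + alpha * d^2), where I_I and I_A are the numbers of ideal and detected edge points and d is the distance from a detected point to the nearest ideal edge point. The function names, the list-of-rows map representation, and the default alpha = 1/9 are assumptions made for this example.

```python
def pratt_figure_of_merit(ideal, detected, alpha=1.0 / 9.0):
    """Return a score in [0, 1]; 1 means the detected map equals the ideal map.

    ideal, detected: binary edge maps given as lists of rows of 0/1 values.
    alpha: scaling constant that penalises displaced edges (assumed default).
    """
    ideal_pts = [(r, c) for r, row in enumerate(ideal)
                 for c, v in enumerate(row) if v]
    found_pts = [(r, c) for r, row in enumerate(detected)
                 for c, v in enumerate(row) if v]
    if not ideal_pts or not found_pts:
        return 0.0
    total = 0.0
    for r, c in found_pts:
        # Squared distance from this detected point to the nearest ideal point.
        d2 = min((r - ri) ** 2 + (c - ci) ** 2 for ri, ci in ideal_pts)
        total += 1.0 / (1.0 + alpha * d2)
    # Dividing by the larger count penalises both missed and spurious points.
    return total / max(len(ideal_pts), len(found_pts))


def vertical_edge_map(rows, cols, col):
    """Hypothetical helper: binary map with a one-pixel vertical edge at `col`."""
    return [[1 if c == col else 0 for c in range(cols)] for _ in range(rows)]
```

With these definitions, a perfectly located edge scores 1, an edge displaced by one pixel scores lower, and a partially detected edge is penalised through the normalising count.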

1.3  Organization  of  Dissertation 

In  the  previous  survey  it  should  be  noticed  that  while 
there are many operators that can be used in edge
detection, the effort given to the comparison and
evaluation  of  these  operators  has  not  been  sufficient.  A 
quantitative  evaluation  of  the  edge  detectors  is  needed  if 
these  operators  are  to  be  efficiently  used  as  a part  of  an 
image  understanding  system.  The  following  chapters  will  be 
devoted  to  the  introduction  of  quantitative  methods  into 
edge  detection  problems.  In  Chapter  2,  a detailed 
discussion  of  the  basic  edge  detection  operators,  used  in 
this  dissertation,  is  given.  An  image  model  is  developed 
in  Chapter  3,  and  used  to  evaluate  the  performance  of  these 
edge  detection  operators.  In  Chapter  4,  edge  detection  is 
formulated as a pattern classification problem, and a least
square error algorithm is used to determine the edge
detector parameters. The figure of merit derived by Pratt
is  used  in  Chapter  5 to  evaluate  the  performance  of  the 
different  operators  in  the  case  of  vertical  or  diagonal 
edges.  The  results  obtained  in  these  chapters  are  used  in 
the  improvement  of  existing  operators  and  in  the 
introduction  of  new  methods  for  edge  detection.  These  are 
given  in  Chapters  6 and  7,  respectively.  In  Chapter  8, 
some final conclusions are presented.

Chapter 2

Review  of  Edge  Detection  Operators 

The  edge  detectors  of  interest  in  this  dissertation 
can  be  defined  as  local  operators  which  are  able  to  detect 
image discontinuities without any a priori knowledge of the
image  content.  These  local  operators  are  useful  as  a first 
step  in  many  image  understanding  systems.  Most  of  the 
local  edge  detectors  can  be  classified  into  two  basic 
groups. The first is the edge enhancement/thresholding
method, which includes the use of simple differential
operators and template matching. The second is the edge
fitting  technique.  For  purposes  of  design  and  analysis, 
the  input  to  the  edge  detector  is  assumed  to  be  an  ideal 
ramp  edge  as  shown  in  Figure  2.1.  The  function  represented 
in  this  figure  is  usually  the  luminance  attribute. 
Parameters  that  describe  this  edge  are  its  location, 
orientation,  edge  width  and  height.  These  parameters  are 
to  be  estimated  by  the  edge  detector.  One  of  the  factors 
which determine the edge detector's performance is the
operator's  accuracy  in  estimating  the  edge  parameters. 


In this chapter, a detailed analysis of some of the
edge detection operators is given. Section 2.1 reviews the
edge enhancement/thresholding operators. Section 2.2
evaluates the edge detectors' performance using an ideal
edge model. Section 2.3 discusses the edge fitting
technique.

2.1  Edge  Enhancement/Thresholding  Methods 

The edge enhancement/thresholding technique can be
represented by the block diagram shown in Figure 2.2. In
this model, the image $F(j,k)$ is first convolved with a set
of linear spatial operators $\{H_i(j,k)\}$; the output $G_i(j,k)$
is given by

$$G_i(j,k) = H_i(j,k) \circledast F(j,k) \qquad (2.1)$$

where $i = 1, 2, \ldots, m$. A nonlinear function of the set
$\{G_i(j,k)\}$ is then calculated. The output $A(j,k)$ is
described by the equation

$$A(j,k) = g\{G_1(j,k), G_2(j,k), \ldots, G_m(j,k)\} \qquad (2.2)$$

Typical forms of the function $g(\cdot)$ are the sum of squares,
the square root, the magnitude, the maximum or combinations
of these functions. The output $A(j,k)$ is a measure of the
discontinuity at the center of the convolving masks; it can
be used to form a grey-level edge map. In order to improve
edge visibility, and to reduce the edge map complexity at
the same time, the grey-level edge map is compared with a
threshold t, and an edge is detected if

$$A(j,k) > t \qquad (2.3a)$$

while if

$$A(j,k) < t \qquad (2.3b)$$

the decision is no edge. The threshold t defines the
resulting edge map; if it is chosen too high, then
low-amplitude changes will not be detected, and if it is
chosen too low, noise can be falsely detected as edges [1].
If an edge is detected, it is often useful to determine its
orientation and height. This information can be obtained
from the set $\{G_i(j,k)\}$, as will be shown later.

After  this  general  introduction  to  the  edge 
enhancement/thresholding  technique,  some  important  examples 
of  the  simple  differential  operators  and  template  matching 
operators  will  be  given. 

2.1.1  Simple  Differential  Operators 

This group of edge detectors includes the Roberts [2],
the Sobel [4], and an operator suggested by Prewitt [3].
The Roberts operator is applied on 2x2 subregions of the
image as sketched in Figure 2.3a. With the pixels of the
2x2 subregion labelled f1, f2 in the first row and f3, f4
in the second row, the output $A(j,k)$ is given by

$$A(j,k) = \left[ (f_1 - f_4)^2 + (f_2 - f_3)^2 \right]^{1/2} \qquad (2.4)$$

a. 2x2 Subregion

f1  f2  f3
f4  f5  f6
f7  f8  f9

b. 3x3 Subregion

Figure 2.3. Image subregions


Equation 2.4 can be viewed as two convolutions

$$X(j,k) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \circledast F(j,k) \qquad (2.5a)$$

$$Y(j,k) = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \circledast F(j,k) \qquad (2.5b)$$

followed by the nonlinearity

$$A(j,k) = \left[ (X(j,k))^2 + (Y(j,k))^2 \right]^{1/2} \qquad (2.6)$$

Roberts has also introduced a magnitude operator, in which
the discrete gradient is alternatively calculated as

$$A(j,k) = |X(j,k)| + |Y(j,k)| \qquad (2.7)$$

In both operators, an edge is detected if $A(j,k) > t$, where
t is a given threshold. If an edge is detected, its
orientation is given by

$$\theta(j,k) = \frac{\pi}{4} + \tan^{-1}\left[ \frac{Y(j,k)}{X(j,k)} \right] \qquad (2.8)$$

The angle $\theta(j,k)$ is measured with respect to the horizontal
axis.
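
As a minimal sketch of the Roberts operator described by Eqs. 2.5 to 2.7, the function below computes the two diagonal differences on every 2x2 subregion and thresholds the result. It assumes that the image is a list of rows, that the masks are applied without flipping (which only changes the signs of X and Y, not the amplitude), and that the output is indexed by the top-left pixel of each subregion; the function name and the `use_magnitude` flag are hypothetical.

```python
import math


def roberts(image, t, use_magnitude=False):
    """Return (amplitude, edge_map) for the Roberts operator.

    image: list of rows of luminance values.
    t: threshold; an edge is declared where the amplitude exceeds t.
    use_magnitude: if True use |X| + |Y| (Eq. 2.7) instead of the
                   square-root form (Eq. 2.6).
    """
    rows, cols = len(image), len(image[0])
    amplitude = [[0.0] * (cols - 1) for _ in range(rows - 1)]
    edge_map = [[0] * (cols - 1) for _ in range(rows - 1)]
    for j in range(rows - 1):
        for k in range(cols - 1):
            f1, f2 = image[j][k], image[j][k + 1]
            f3, f4 = image[j + 1][k], image[j + 1][k + 1]
            x = f3 - f2  # mask [[0, -1], [1, 0]], applied without flipping
            y = f4 - f1  # mask [[-1, 0], [0, 1]], applied without flipping
            a = abs(x) + abs(y) if use_magnitude else math.hypot(x, y)
            amplitude[j][k] = a
            edge_map[j][k] = 1 if a > t else 0
    return amplitude, edge_map
```

For a vertical step of height 10, the square-root form responds with 10 times the square root of 2 at the step and the magnitude form with 20, so the two forms need different thresholds to produce the same edge map.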


Approximations of the discrete gradient function by
3x3 operators were given by Prewitt [3] and later by Sobel
[4]. These operators are applied on 3x3 subregions of the
image as sketched in Figure 2.3b. The outputs $X(j,k)$ and
$Y(j,k)$ are given by

$$X(j,k) = \begin{bmatrix} 1 & 0 & -1 \\ c & 0 & -c \\ 1 & 0 & -1 \end{bmatrix} \circledast F(j,k) \qquad (2.9a)$$

$$Y(j,k) = \begin{bmatrix} -1 & -c & -1 \\ 0 & 0 & 0 \\ 1 & c & 1 \end{bmatrix} \circledast F(j,k) \qquad (2.9b)$$

where the constant c is 1 in the Prewitt and 2 in the
Sobel operator. The output $A(j,k)$ is still given by
Eq. 2.6, while the edge orientation with respect to the
horizontal axis is calculated by

$$\theta(j,k) = \tan^{-1}\left[ \frac{Y(j,k)}{X(j,k)} \right] \qquad (2.10)$$
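
The 3x3 gradient operators can be sketched in the same way. The function below is an illustration under stated assumptions: pixels are labelled f1 to f9 row by row as in Figure 2.3b, X is taken as the right column minus the left column and Y as the bottom row minus the top row (a sign convention chosen here), and `math.atan2` is used in place of the plain arctangent of Eq. 2.10 to avoid division by zero. The function name and return format are hypothetical.

```python
import math


def gradient_3x3(image, c, t):
    """Apply the Prewitt (c = 1) or Sobel (c = 2) operator.

    Returns a dict mapping (j, k) of each detected edge pixel to a tuple
    (amplitude, orientation in radians).
    """
    rows, cols = len(image), len(image[0])
    edges = {}
    for j in range(1, rows - 1):
        for k in range(1, cols - 1):
            f1, f2, f3 = image[j - 1][k - 1], image[j - 1][k], image[j - 1][k + 1]
            f4, f6 = image[j][k - 1], image[j][k + 1]
            f7, f8, f9 = image[j + 1][k - 1], image[j + 1][k], image[j + 1][k + 1]
            x = (f3 + c * f6 + f9) - (f1 + c * f4 + f7)  # right minus left
            y = (f7 + c * f8 + f9) - (f1 + c * f2 + f3)  # bottom minus top
            a = math.hypot(x, y)  # Eq. 2.6
            if a > t:
                edges[(j, k)] = (a, math.atan2(y, x))
    return edges
```

For a vertical step of height h, this gives an amplitude of 4h for the Sobel weights and 3h for the Prewitt weights, which is why a common threshold is not directly comparable between the two.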


2.1.2  Template  Matching  Operators 


The compass gradient [3], Kirsch [6], 3-level and
5-level operators [7] are examples of template matching
operators. In this technique, the input image is convolved
with the set of linear masks $\{H_i(j,k)\}$ shown in Figure 2.4.
The outputs $\{G_i(j,k)\}$ measure the gradient components along
the basic orientations. The enhanced edge is formed as the
maximum of the gradient arrays. Thus

$$A(j,k) = \max\{ |G_1(j,k)|, |G_2(j,k)|, \ldots, |G_m(j,k)| \} \qquad (2.11)$$

If $A(j,k)$ is greater than the threshold t, an edge is
detected with orientation $\theta(j,k)$ given by the compass
direction of the largest gradient component. Because of
the symmetry of the 3-level and 5-level masks, they can be
implemented using the first four masks only.

Figure 2.4. Template matching operators
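
Equation 2.11 can be illustrated with a short sketch. The masks of Figure 2.4 are not reproduced in this text, so the example assumes the commonly used Kirsch weights (three 5s and five -3s on the border of a 3x3 mask, zero at the center) and generates the eight compass masks by rotating the border. The names `RING`, `KIRSCH_WEIGHTS`, `compass_masks`, and `template_match` are hypothetical.

```python
# Border positions of a 3x3 mask, listed clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

# Assumed Kirsch border weights for the first mask (5s on the top row).
KIRSCH_WEIGHTS = [5, 5, 5, -3, -3, -3, -3, -3]


def compass_masks(weights):
    """Build eight 3x3 masks by rotating the border weights in 45 degree steps."""
    masks = []
    for shift in range(8):
        mask = [[0] * 3 for _ in range(3)]
        for i, (r, c) in enumerate(RING):
            mask[r][c] = weights[(i - shift) % 8]
        masks.append(mask)
    return masks


def template_match(window, masks):
    """Return (A, index) where A is the largest absolute mask response (Eq. 2.11)."""
    responses = [
        sum(m[r][c] * window[r][c] for r in range(3) for c in range(3))
        for m in masks
    ]
    best = max(range(len(masks)), key=lambda i: abs(responses[i]))
    return abs(responses[best]), best
```

The returned index identifies the winning mask, which plays the role of the quantized compass direction; with the rotation order assumed here, index 2 is the mask whose 5s lie on the right-hand column.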

In Chapter 1, it was mentioned that the previous four
operators can be considered as cross-correlation template
matching operators. This can be shown as follows: assume
that it is required to match a subregion of the image with
one of m templates, where the elements of the $\ell$'th template
are shown in Figure 2.5. The $\ell$'th cross correlation is
given by

R_l = ... (b + ...    (2.12)

The first term of Eq. 2.12 is constant for a given
subregion. In addition, h is proportional to the sum over
a_j .... Thus maximizing Eq. 2.12 is equivalent to
maximizing ....

Figure 2.5. Elements of the $\ell$'th template

In this section a survey of the edge
enhancement/thresholding operators has been given. It
should be noticed that, because of the diversity of the
operators used, it is useful to compare the performance of
these operators quantitatively. There are different
approaches that can be used in this comparison. One
example is to compare the edge detectors' outputs for a set
of ideal edges. This technique will be considered in the
following section. Other methods that implement
statistical detection theory will be discussed in
Chapter 3.

2.2  Edge  Detectors  Performance,  Case  of  Ideal  Edge 


In this analysis, the edge model shown in Figure 2.1
is used. Here the edge is assumed to be of zero width
(ideal step function). When an edge detector is applied on
this edge model, the output will be determined by the edge
position and orientation. To simplify the analysis, the
effect of each parameter is considered separately. First,
the edge is assumed to pass through the center of the edge
detector with general edge orientation $\phi$. Second, the edge
is assumed to have a fixed orientation while its distance
from the edge-detector center is varied. In both cases the
outputs of the different edge detectors are evaluated.

2.2.1 Case of a Central Edge with Orientation $\phi$

The average intensities of the different pixels of a
2x2 and a 3x3 image subregion containing a central edge
are shown in Figure 2.6. These intensities are given as a
function of the edge orientation $\phi$. Because of the
symmetry of the edge detectors, it is sufficient to measure
the operators' performance for $0 \le \phi \le \pi/4$.

When the Sobel operator is applied on this edge model,
the values of the output A* and the estimated edge
orientation $\theta$ are as follows.

$$A = \begin{cases}
4h \sec\phi, & 0 \le \phi \le \tan^{-1}(1/3) \\
\dfrac{h}{4\tan\phi} \left\{ [-9\tan^2\phi + 22\tan\phi - 1]^2 + [7\tan^2\phi + 6\tan\phi - 1]^2 \right\}^{1/2}, & \tan^{-1}(1/3) \le \phi \le \pi/4
\end{cases} \qquad (2.13)$$

$$\theta = \begin{cases}
\phi, & 0 \le \phi \le \tan^{-1}(1/3) \\
\tan^{-1}\left( \dfrac{7\tan^2\phi + 6\tan\phi - 1}{-9\tan^2\phi + 22\tan\phi - 1} \right), & \tan^{-1}(1/3) \le \phi \le \pi/4
\end{cases} \qquad (2.14)$$

Similar expressions can be obtained for the other simple
differential operators.

When the Kirsch operator is applied, the values of A
and $\theta$ are as follows.

$$A = \begin{cases}
12h, & 0 \le \phi \le \tan^{-1}(1/3) \\
h\left[ 12 - \dfrac{(3\tan\phi - 1)^2}{\tan\phi} \right], & \tan^{-1}(1/3) \le \phi \le \tan^{-1}(1/2) \\
h\left[ 12 - \dfrac{(1 - \tan\phi)^2}{\tan\phi} \right], & \tan^{-1}(1/2) \le \phi \le \pi/4
\end{cases} \qquad (2.15)$$

$$\theta = \begin{cases}
0, & 0 \le \phi \le \tan^{-1}(1/2) \\
\pi/4, & \tan^{-1}(1/2) \le \phi \le \pi/4
\end{cases} \qquad (2.16)$$

Similar expressions can be obtained for the other template
matching operators.

* Starting with this section, the (j,k) coordinates are
dropped.

Plots of the values of A and $\theta$ for different edge
enhancement/thresholding operators are given in Figures 2.7
and 2.8. In these curves, the value of A is normalized
with respect to its value for a vertical edge. From these
curves, it is clear that none of the edge detectors is
isotropic, because A varies with $\phi$. This variation is
smaller in the template matching operators than in the
simple differential operators. Also, the estimated edge
orientation $\theta$ is usually different from the actual
orientation $\phi$. This difference is smaller for the simple
differential operators than for the template matching
operators. This is basically because the template matching
operators measure the edge orientation in quantized steps.
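
The anisotropy of the Sobel operator can be checked numerically. The sketch below is an illustration under stated assumptions: pixels are unit squares, the step edge of height h passes through the center of the 3x3 subregion, the bright side is the half-plane x cos(phi) - y sin(phi) > 0, and pixel averages are estimated by sampling each pixel on an n x n grid. The closed-form function encodes one reading of Eqs. 2.13 and 2.14 for this geometry; all function names are hypothetical.

```python
import math


def pixel_mean(cx, cy, phi, h=1.0, n=100):
    """Estimate the mean luminance of the unit pixel centred at (cx, cy)."""
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    bright = 0
    for a in range(n):
        x = cx - 0.5 + (a + 0.5) / n
        for b in range(n):
            y = cy - 0.5 + (b + 0.5) / n
            if x * cos_p - y * sin_p > 0.0:  # sample lies on the bright side
                bright += 1
    return h * bright / (n * n)


def sobel_numeric(phi, h=1.0, n=100):
    """Sobel amplitude and estimated orientation from sampled pixel means."""
    f = {(i, j): pixel_mean(i, j, phi, h, n)
         for i in (-1, 0, 1) for j in (-1, 0, 1)}
    w = {-1: 1, 0: 2, 1: 1}
    x = sum(w[j] * (f[(1, j)] - f[(-1, j)]) for j in w)   # right minus left
    y = sum(w[i] * (f[(i, -1)] - f[(i, 1)]) for i in w)   # bottom minus top
    return math.hypot(x, y), math.atan2(y, x)


def sobel_closed_form(phi, h=1.0):
    """Closed-form amplitude and orientation for 0 <= phi <= pi/4."""
    t = math.tan(phi)
    if t <= 1.0 / 3.0:
        return 4.0 * h / math.cos(phi), phi
    x = h * (-9 * t * t + 22 * t - 1) / (4 * t)
    y = h * (7 * t * t + 6 * t - 1) / (4 * t)
    return math.hypot(x, y), math.atan2(y, x)
```

Comparing `sobel_closed_form(0.0)` with `sobel_closed_form(math.pi / 4)` shows the amplitude rising from 4h for a vertical edge to about 4.24h for a diagonal edge, which is the kind of variation with orientation discussed above.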

2.2.2  Case  of  a Fixed-Orientation  Edge  with  Varying 
Displacement 

In  this  case,  the  edge  is  assumed  to  have  a fixed 
orientation,  while  its  distance  to  the  center  of  the  edge 
detector  is  changed.  The  edge  orientations  chosen  are  the 
vertical and the diagonal, with $\phi = 0$ and $\pi/4$,
respectively. Similar results can be obtained for
horizontal and $-\pi/4$ orientation edges. These are the only
edge  orientations  for  which  the  continuous-edge  shape  is 
preserved  after  sampling. 

Figure 2.7. Edge gradient amplitude response as a
function of actual edge orientation for
2x2 and 3x3 operators

Figure 2.8. The detected edge orientation as a
function of actual edge orientation for
2x2 and 3x3 operators

The intensities of the different pixels for a
displaced vertical edge are shown in Figure 2.9. When the
Sobel operator is applied on this edge model, the value of
the output A is given by

$$A = \begin{cases}
4h, & 0 \le d \le \frac{1}{2} \\
4h\left( \frac{3}{2} - d \right), & \frac{1}{2} \le d \le \frac{3}{2}
\end{cases} \qquad (2.17)$$

When the Kirsch operator is used, A is given by

$$A = \begin{cases}
12h\left( \frac{d}{2} + 1 \right), & 0 \le d \le \frac{1}{2} \\
15h\left( \frac{3}{2} - d \right), & \frac{1}{2} \le d \le \frac{3}{2}
\end{cases} \qquad (2.18)$$

Plots of A for the different operators are shown in
Figure 2.10a.
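
The response to a displaced vertical edge can be worked out directly from column averages. The sketch below assumes unit-width pixel columns centered at x = -1, 0, 1, a step of height h located at x = d with the bright side on the right, and, for the Kirsch case, that the largest response comes from the mask with weight 5 on the right-hand column. The piecewise functions encode one reading of the Sobel and Kirsch expressions for this case; all names are hypothetical.

```python
def bright_fraction(col, d):
    """Fraction of the unit-wide pixel column centred at x = col right of x = d."""
    return min(1.0, max(0.0, col + 0.5 - d))


def sobel_vertical(d, h=1.0):
    """Sobel amplitude from column averages (Y is zero for a vertical edge)."""
    return 4.0 * h * (bright_fraction(1, d) - bright_fraction(-1, d))


def kirsch_vertical(d, h=1.0):
    """Response of the Kirsch mask with 5s on the right column.

    Column weight sums: +15 on the right, -9 on the left, and -6 on the
    middle column (whose centre weight is zero).
    """
    return h * (15.0 * bright_fraction(1, d)
                - 9.0 * bright_fraction(-1, d)
                - 6.0 * bright_fraction(0, d))


def sobel_piecewise(d, h=1.0):
    """Piecewise form for 0 <= d <= 3/2."""
    return 4.0 * h if d <= 0.5 else 4.0 * h * (1.5 - d)


def kirsch_piecewise(d, h=1.0):
    """Piecewise form for 0 <= d <= 3/2."""
    return 12.0 * h * (1.0 + d / 2.0) if d <= 0.5 else 15.0 * h * (1.5 - d)
```

Under these assumptions the column-average and piecewise forms agree for displacements between 0 and 3/2. The Sobel response stays flat until the edge leaves the central column, while the Kirsch response first increases and then decays.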


In the case of a diagonal edge, the average
intensities become a second order polynomial of the
distance across the diagonal. The output A for the Sobel
operator is given by

$$A = \sqrt{2}\, h \times \begin{cases}
3 - 2d^2, & 0 \le d \le \frac{1}{\sqrt{2}} \\
1 - \left( d - \frac{1}{\sqrt{2}} \right)^2 + 2\left( \sqrt{2} - d \right)^2, & \frac{1}{\sqrt{2}} \le d \le \sqrt{2} \\
\left( \frac{3}{\sqrt{2}} - d \right)^2, & \sqrt{2} \le d \le \frac{3}{\sqrt{2}}
\end{cases} \qquad (2.19)$$

and for the Kirsch operator

$$A = \begin{cases}
h\left[ 5 + 10(1 - d^2) - 6\left( \frac{1}{\sqrt{2}} - d \right)^2 \right], & 0 \le d \le \frac{1}{\sqrt{2}} \\
h\left[ 5 - 5\left( d - \frac{1}{\sqrt{2}} \right)^2 + 10\left( \sqrt{2} - d \right)^2 \right], & \frac{1}{\sqrt{2}} \le d \le \sqrt{2} \\
5h\left( \frac{3}{\sqrt{2}} - d \right)^2, & \sqrt{2} \le d \le \frac{3}{\sqrt{2}}
\end{cases} \qquad (2.20)$$

Figure 2.10. Edge gradient amplitude response as a
function of edge displacement for 2x2
and 3x3 operators: a) vertical edge; b) diagonal edge
(axes: edge gradient amplitude versus edge displacement)

Plots of A for the different operators are given in
Figure 2.10. In these curves, A is normalized with respect
to its value for a central edge. These curves can be used
to determine edge detector resolution. It should be
noticed that small size operators have better resolution.
Also, for operators with the same mask size, the resolution
is slightly dependent on the mask shape.

The results obtained in this section show that edge
detector performance in the case of edges with general
location and orientation can be approximately determined
from their performance in the case of central edges with
vertical or diagonal orientations. This last case is used
as the ideal edge model in the following chapters.

2.3 Edge Fitting Method - Hueckel's Algorithm

In edge fitting, the image function $F(x,y)$ defined
over a subregion is compared with an ideal edge model
$S_p(x,y)$, where p is the edge parameter vector. The
difference between the actual and ideal models is a function
of p, and by changing these parameters the difference can
be minimized. Edge acceptance is based on the value of the
minimum difference. If it is less than a given threshold
t, the image subregion is classified as an edge with the
corresponding parameter $p_{min}$. Usually the mean square
error is used to measure the difference between the ideal
and actual edge. This error is given in the form

$$E_p = \iint [F(x,y) - S_p(x,y)]^2 \, dx \, dy \qquad (2.21)$$
Minimization of the error $E_p$ can be obtained by an
iterative procedure, which is time consuming. However, it is
possible to introduce approximations of Eq. 2.21 such that
its minimization can be achieved by simple analytic
methods. This was the basic contribution of Hueckel in his
papers published in 1971 and 1973. In the first paper,
Hueckel used an orthogonal transformation to solve the
problem of edge fitting [8]. Later, he extended his ideas
to general edge-line fitting [14]. The Hueckel algorithm
can be summarized as follows: A circular subregion of the
image is compared with the edge model shown in Figure 2.11.
The luminance function $S_p(x,y)$ of this edge-line model is
given by

$$S_p(x,y) = \begin{cases}
b_-, & A \le r_- \le r_+ \\
b_- + t_-, & r_- < A \le r_+ \\
b_- + t_- + t_+, & r_- \le r_+ < A
\end{cases} \qquad (2.22)$$

where 


E = 


c 

y 


r 


(2.23) 


The functions $F(x,y)$ and $S_p(x,y)$ are expanded using a set
of two-dimensional orthogonal functions $\{H_i\}$. This set is
chosen to be separable into the product of an angular and
a radial component. The error is now in the form

$$E_p = \sum_{i=0}^{\infty} (a_i - s_i)^2 \qquad (2.24)$$

where

$$a_i = \iint H_i(x,y) F(x,y) \, dx \, dy \qquad (2.25)$$

$$s_i = \iint H_i(x,y) S_p(x,y) \, dx \, dy \qquad (2.26)$$

The series in Eq. 2.24 is approximated by its first nine
components. The minimization of this truncated form and
calculation of the corresponding $p_{min}$ can be achieved by
solving simple algebraic equations. Hueckel argued that
the truncation of the error series does not affect the
performance of his algorithm because high frequency
components are more related to image noise than to its
signal contents.


The  Hueckel  algorithm  has  been  considered  by  many  as 
an  almost  optimum  procedure  for  edge  detection.  A detailed 
analysis  of  this  algorithm  shows  that  this  is  not  true. 
The  basic  difficulties  with  the  Hueckel  algorithm  are  the 
effect  of  the  truncation  of  the  series  expansion  and 
inaccuracies  in  the  minimization  procedure  and  computation 
of  the  edge  parameters.  These  problems  are  discussed  in 
Appendix A.

A major criticism of the previous approach to edge
fitting  is  the  fact  that  although  images  are  usually 
discrete  functions,  the  optimization  procedure  is  derived 
in  the  continuous  domain,  thus  the  results  obtained  are 
suboptimum.  This  difficulty  can  be  avoided  by  using  the 
discrete  image  model  in  the  derivation  of  the  minimization 
procedure.  An  algorithm  based  on  this  idea  will  be 
introduced  in  Chapter  7. 

2.4  Conclusion 

In  this  chapter  a review  of  some  of  the  basic  edge 
detection  operators  has  been  given.  The  operators  chosen 
have  the  advantage  of  possessing  simple  mathematical 
formulas  defined  over  a small  region  of  the  image,  and  thus 
it  is  not  difficult  to  introduce  a quantitative  evaluation 
of  their  performance.  In  Chapters  3,  4,  5 and  6,  different 
quantitative  methods  are  used  in  the  design  and  evaluation 
of  the  edge  enhancement/thresholding  operators.  In 
Chapter  7,  further  investigation  of  the  edge  fitting 
technique is given.

Chapter 3

Statistical  Model  for  Edge  Detection 

One of the methods which can be used in the evaluation
of edge detection operators is to test their performance
in the case of an ideal signal with additive noise. This
test is easy to implement. In addition, if the noise is
assumed to be additive, white, and Gaussian, analytical
results are not difficult to derive. Since edge detectors
are  used  to  classify  different  illumination  inputs  into 
edges  or  no  edges,  their  performance  can  be  tested  by 
introducing  inputs  in  the  form  of  a noisy  edge,  or  no  edge, 
and  then  estimating  the  probability  of  making  the  right 
decision  in  each  case.  The  following  sections  develop  a 
statistical  model  for  edge  detection.  Section  3.1  is  a 
review  of  different  decision  rules  used  in 
hypothesis-testing.  Section  3.2  evaluates  the  performance 
of  the  edge  detectors  for  noisy  edges.  Section  3.3 
discusses  the  estimation  of  the  edge  orientation. 

3.1 Edge Detection as a Hypothesis-Testing Problem [4, 15, 16]

In Section 2.1, the edge enhancement/thresholding
technique was described in detail. This technique closely
resembles the hypothesis-testing algorithms used in
classical statistical decision theory. The edge
enhancement/thresholding operators have as an input an
image subregion, for which one of two hypotheses is true:

$H_1$: The subregion corresponds to an edge;

$H_2$: The subregion corresponds to no edge.

The edge detector calculates a function A of the input
image, and accepts one of the two hypotheses according to
the rule: accept $H_1$ if

$$A > t \qquad (3.1)$$

otherwise accept $H_2$.

If  the  input  image  is  noise  free,  it  is  possible  to 
find  a perfect  decision  strategy.  On  the  other  hand,  if 
the  image  is  affected  by  noise  there  will  always  be  a 
possibility  of  making  a wrong  decision.  For  this  case, 
four  probabilities  can  be  derived 


$$P(\text{edge} \mid \text{edge}) = P(A > t \mid \text{edge}) \qquad (3.2)$$

$$P(\text{no edge} \mid \text{no edge}) = P(A < t \mid \text{no edge}) \qquad (3.3)$$

$$P(\text{no edge} \mid \text{edge}) = P(A < t \mid \text{edge}) \qquad (3.4)$$

$$P(\text{edge} \mid \text{no edge}) = P(A > t \mid \text{no edge}) \qquad (3.5)$$

The first two equations correspond to correct decisions,
while the other two correspond to incorrect decisions.


If the probabilities of occurrence of edges and no
edges in a given image are known, then the probability of
error will be in the form

$$P(\text{error}) = P(\text{no edge} \mid \text{edge})\, P(\text{edge}) + P(\text{edge} \mid \text{no edge})\, P(\text{no edge}) \qquad (3.6)$$

A decision procedure to minimize this probability of error
is given by the rule: decide an edge if

$$\frac{p(A \mid \text{edge})}{p(A \mid \text{no edge})} \ge \frac{P(\text{no edge})}{P(\text{edge})} \qquad (3.7)$$


and decide no edge otherwise. This method is known as the
Bayes decision rule for minimum probability of error. In
Eq. 3.7, p(A | edge) and p(A | no edge) are the conditional
probability density functions of A. A sketch of these
densities is shown in Figure 3.1. The threshold t is
set at a value which satisfies Eq. 3.7 with equality. In
the special case where edges and no edges are equally probable,

$$t = a \qquad (3.8)$$

where a is the point of intersection of the two conditional
probability densities.
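
As an illustration of the Bayes rule of Eq. 3.7, the following minimal sketch finds the threshold numerically. It assumes, purely for illustration, that both conditional densities of A are Gaussian with a common standard deviation; the densities actually obtained for the edge detectors are derived in Section 3.2. The function names and the numerical values are hypothetical.

```python
import math


def log_gaussian(a, mean, sigma):
    """Log of a Gaussian density with the given mean and standard deviation."""
    return -0.5 * ((a - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def bayes_threshold(mean_no_edge, mean_edge, sigma, p_edge, lo=-50.0, hi=50.0):
    """Threshold t at which p(A|edge) P(edge) = p(A|no edge) P(no edge).

    Assumes mean_edge > mean_no_edge, so the log-likelihood ratio is
    increasing in A and bisection on [lo, hi] is valid.
    """
    log_prior_ratio = math.log((1.0 - p_edge) / p_edge)

    def excess(a):
        # Positive when the rule of Eq. 3.7 favours the edge hypothesis.
        return (log_gaussian(a, mean_edge, sigma)
                - log_gaussian(a, mean_no_edge, sigma)
                - log_prior_ratio)

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Equal priors: the threshold falls at the intersection of the two densities.
print(bayes_threshold(0.0, 2.0, 1.0, p_edge=0.5))
# Rare edges: the threshold moves toward the edge mean.
print(bayes_threshold(0.0, 2.0, 1.0, p_edge=0.2))
```

With equal priors the two Gaussian densities intersect midway between their means, which is the special case of Eq. 3.8; lowering P(edge) raises the threshold, so fewer subregions are declared edges.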

If, in addition, the costs of taking one of the four
decisions are known, namely C(edge | edge), ... ,
C(no edge | no edge), then a decision procedure to minimize
the average cost is to decide an edge if

$$\frac{p(A \mid \text{edge})}{p(A \mid \text{no edge})} \ge \frac{P(\text{no edge})}{P(\text{edge})} \cdot \frac{C(\text{edge} \mid \text{no edge}) - C(\text{no edge} \mid \text{no edge})}{C(\text{no edge} \mid \text{edge}) - C(\text{edge} \mid \text{edge})} \qquad (3.9)$$

Otherwise, decide no edge. The threshold t can be
specified accordingly.


In more general cases, when the probabilities of edges
or no edges are not known, the threshold t can be set by
one of the following two methods.

In the first method, t is set to achieve a given
probability of missing an edge, P(no edge | edge), while
minimizing the probability of false detection,
P(edge | no edge). In this case, t is the solution of the
equation

$$P(\text{no edge} \mid \text{edge}) = \int_{-\infty}^{t} p(A \mid \text{edge})\, dA \qquad (3.10)$$

This method, known as the Neyman-Pearson criterion, is
frequently used in radar detection.


In the second method, t is set to minimize the maximum
possible error that occurs when the probabilities of edges
or no edges change for different input images. In this
case the edge detector threshold is chosen such that

$$P(\text{edge} \mid \text{no edge}) = P(\text{no edge} \mid \text{edge}) \qquad (3.11a)$$

or

$$\int_{t}^{\infty} p(A \mid \text{no edge})\, dA = \int_{-\infty}^{t} p(A \mid \text{edge})\, dA \qquad (3.11b)$$

This is known as the minimax criterion.


Any  of  the  previous  decision  strategies  can  be  used  in 
the design of edge detectors, especially the Neyman-Pearson
criterion, which does not require the knowledge of the
probabilities  of  edges  or  no  edges.  After  choosing  the 
threshold  t,  the  performance  of  the  edge  detector  can  be 
evaluated  as  a function  of  the  probabilities  of  detection 
and  false  detection.  Computation  of  these  probabilities 
for  the  edge  enhancement/thresholding  operators  is  given  in 
the  following  section. 
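
The Neyman-Pearson and minimax thresholds can be sketched in the same way. The example below again assumes Gaussian conditional densities for A (an assumption made only to keep the example short) and uses the standard-library `statistics.NormalDist`; the function names and numbers are hypothetical.

```python
from statistics import NormalDist


def neyman_pearson_threshold(edge_dist, miss_probability):
    """Threshold t with P(no edge | edge) = integral of p(A|edge) up to t."""
    return edge_dist.inv_cdf(miss_probability)


def minimax_threshold(no_edge_dist, edge_dist, lo, hi, iterations=100):
    """Threshold t with P(edge | no edge) = P(no edge | edge), by bisection.

    The difference (false detection - miss) decreases as t grows, so the
    root is bracketed as long as it is positive at lo and negative at hi.
    """
    def gap(t):
        false_detection = 1.0 - no_edge_dist.cdf(t)
        miss = edge_dist.cdf(t)
        return false_detection - miss

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


no_edge = NormalDist(mu=0.0, sigma=1.0)   # hypothetical p(A | no edge)
edge = NormalDist(mu=3.0, sigma=1.0)      # hypothetical p(A | edge)

t_np = neyman_pearson_threshold(edge, miss_probability=0.1)
t_mm = minimax_threshold(no_edge, edge, lo=-10.0, hi=10.0)
print("Neyman-Pearson:", t_np, "false detection:", 1.0 - no_edge.cdf(t_np))
print("minimax:", t_mm)
```

Neither threshold uses the prior probabilities of edges or no edges, which is why these criteria remain usable when those probabilities are unknown.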

3.2 Edge Detector Performance, Case of Ideal Edge Plus
Noise


In the model used in this section, an image subregion
is considered to be the sum of two components. The first
is an ideal central edge with orientations $\phi = 0$ or $\pi/4$,
while the second is an additive white Gaussian noise with
zero mean and standard deviation $\sigma$. The actual intensity
$f_j$ is then given by

$$f_j = s_j + n_j \qquad (3.12)$$

where $s_j$ and $n_j$ are the ideal and noise components,
respectively. The random variable $f_j$ has the probability
density function

$$p(f_j) = (2\pi\sigma^2)^{-1/2} \exp\left[-\frac{(f_j - s_j)^2}{2\sigma^2}\right] \qquad (3.13)$$


When an edge detector is applied on this image model, the
output of the i'th convolving mask is given by

$$G_i = \sum_j M_i(j)\, f_j \qquad (3.14)$$

where $M_i(j)$ are the components of the mask $H_i$. In this
case $\{G_i\}$ will be jointly Gaussian with the probability
density function

$$p(G) = (2\pi)^{-m/2}\, |\Sigma|^{-1/2} \exp\left[-\tfrac{1}{2}(G - \tilde{G})^T \Sigma^{-1} (G - \tilde{G})\right] \qquad (3.15)$$


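In Eq. 3.15, $G$ and $\tilde{G}$ are vectors of the actual and ideal
mask outputs given by

$$G = [G_1 \; G_2 \; \cdots \; G_m]^T \qquad (3.16)$$

$$\tilde{G} = [\tilde{G}_1 \; \tilde{G}_2 \; \cdots \; \tilde{G}_m]^T \qquad (3.17)$$

with

$$\tilde{G}_i = \sum_j M_i(j)\, s_j \qquad (3.18)$$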


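Also, the covariance matrix $\Sigma$ is given by

$$\Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1m} \\ \vdots & & \vdots \\ \sigma_{m1} & \cdots & \sigma_{mm} \end{bmatrix} \qquad (3.19)$$

with

$$\sigma_{kl} = \sigma^2 \sum_j M_k(j)\, M_l(j) \qquad (3.20)$$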


The analysis introduced so far applies to both simple
differential and template matching operators. To obtain
expressions for the probability density function of A, each
group of edge detectors has to be considered separately.
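
The covariance of Eq. 3.20 is easy to evaluate for a given pair of masks. The sketch below uses the commonly quoted unnormalized 3 x 3 Sobel masks, which are an assumption here (the masks defined in Chapter 2 may carry a different scale factor); the function name is hypothetical.

```python
def mask_covariance(masks, sigma):
    """Covariance of the mask outputs under white noise of std sigma.

    Entry (k, l) is sigma^2 * sum_j M_k(j) * M_l(j), as in Eq. 3.20.
    """
    flat = [[value for row in mask for value in row] for mask in masks]
    return [[sigma ** 2 * sum(a * b for a, b in zip(mk, ml)) for ml in flat]
            for mk in flat]


# Unnormalized Sobel masks (assumed form, used only for illustration).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[1, 2, 1],
           [0, 0, 0],
           [-1, -2, -1]]

covariance = mask_covariance([SOBEL_X, SOBEL_Y], sigma=1.0)
for row in covariance:
    print(row)
```

The off-diagonal entries vanish because the two masks are orthogonal, so the two outputs are uncorrelated and, being jointly Gaussian, independent.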


3.2.1  Simple  Differential  Operators 


With  the  Roberts,  Sobel,  and  Prewitt  operators,  two 
convolving  masks  are  used.  The  outputs  X and  Y are  joint 
Gaussian  with  mean  and  covariance  matrix  as  given  in 
Table  3.1. 


From Table 3.1, it can be noticed that the random
variables X and Y are independent. If the nonlinear
function used is the square root, then

$$A = (X^2 + Y^2)^{1/2} \qquad (3.21)$$

and the probability density function of A in the case of no
edge is given by [17]. Thus,

$$p(A) = \begin{cases} \dfrac{A}{\sigma_r^2} \exp\left[-\dfrac{A^2}{2\sigma_r^2}\right] & A \ge 0 \\ 0 & A < 0 \end{cases} \qquad (3.22)$$

Table 3.1. Mean Vector and Covariance Matrix of
Differential Gradient Operators


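while in the case of an edge

$$p(A) = \begin{cases} \dfrac{A}{\sigma_r^2} \exp\left[-\dfrac{A^2 + a^2}{2\sigma_r^2}\right] I_0\!\left(\dfrac{A a}{\sigma_r^2}\right) & A \ge 0 \\ 0 & A < 0 \end{cases} \qquad (3.23)$$

where $\sigma_r^2$ is the diagonal element of $\Sigma$, and

$$a^2 = \tilde{X}^2 + \tilde{Y}^2 \qquad (3.24)$$

In Eq. 3.23, $I_0(\cdot)$ is the modified Bessel function of zero
order.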


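The previous probability density functions can be used
to determine the probability of false detection $P_F$ and the
probability of correct detection $P_D$, for a given threshold
t. These probabilities are of the form [18]

$$P_F = \exp\left[-\frac{t^2}{2\sigma_r^2}\right] \qquad (3.25)$$

$$P_D = Q\!\left(\frac{a}{\sigma_r}, \frac{t}{\sigma_r}\right) \qquad (3.26)$$

where Q(a,b) is Marcum's Q-function defined as

$$Q(a,b) = \int_b^{\infty} x \exp\left[-\frac{a^2 + x^2}{2}\right] I_0(ax)\, dx \qquad (3.27)$$

If the nonlinear function used is the sum of magnitudes

$$A = |X| + |Y| \qquad (3.28)$$

the probability density function p(A) can be derived in the
form

$$p(A) = \frac{1}{2\sqrt{\pi}\,\sigma_r} \exp\left(-\frac{A^2 + a^2}{4\sigma_r^2}\right) \left[p_1(\tilde{X},\tilde{Y}) + p_1(\tilde{X},-\tilde{Y}) + p_1(-\tilde{X},\tilde{Y}) + p_1(-\tilde{X},-\tilde{Y})\right] \qquad (3.29)$$

where

$$p_1(X,Y) = \exp\left[\frac{(X+Y)A - XY}{2\sigma_r^2}\right] \left[\operatorname{erf}\!\left(\frac{A + X - Y}{\sqrt{2}\,\sigma_r}\right) + \operatorname{erf}\!\left(\frac{A - X + Y}{\sqrt{2}\,\sigma_r}\right)\right] \qquad (3.30)$$

The corresponding probabilities $P_F$ and $P_D$ are

$$P_F = \int_t^{\infty} p(A \mid \text{no edge})\, dA \qquad (3.31)$$

$$P_D = \int_t^{\infty} p(A \mid \text{edge})\, dA \qquad (3.32)$$

In the previous equations

$$\operatorname{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_0^x \exp\left(-\frac{y^2}{2}\right) dy \qquad (3.33)$$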
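
For the square-root magnitude, the no-edge density is Rayleigh, and integrating it from t to infinity gives a false detection probability of exp(-t^2 / (2 sigma_r^2)). The following sketch checks this closed form against a simple Monte Carlo simulation. The function names, the symbol `sigma_r` for the standard deviation of each mask output, and the numerical values are assumptions made for illustration.

```python
import math
import random


def simulated_false_detection(threshold, sigma_r, trials=100_000, seed=0):
    """Fraction of no-edge trials in which sqrt(X^2 + Y^2) exceeds threshold.

    X and Y are independent zero-mean Gaussian mask outputs with standard
    deviation sigma_r, as in the no-edge case.
    """
    rng = random.Random(seed)
    detections = 0
    for _ in range(trials):
        x = rng.gauss(0.0, sigma_r)
        y = rng.gauss(0.0, sigma_r)
        if math.hypot(x, y) > threshold:
            detections += 1
    return detections / trials


def rayleigh_false_detection(threshold, sigma_r):
    """Closed-form tail probability of the Rayleigh density."""
    return math.exp(-threshold ** 2 / (2.0 * sigma_r ** 2))


estimate = simulated_false_detection(threshold=2.0, sigma_r=1.0)
exact = rayleigh_false_detection(threshold=2.0, sigma_r=1.0)
print(estimate, exact)
```

The same simulation, with nonzero means for X and Y, can be used to estimate the probability of correct detection when an implementation of Marcum's Q-function is not available.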


To compare the performance of the Roberts, Sobel and
Prewitt operators, the probability of correct detection $P_D$
is plotted as a function of the probability of false
detection $P_F$. Figure 3.2 presents such plots for vertical
and $\pi/4$ edges, with signal-to-noise ratios, SNR*, of 1.0 and
10.0. From these curves it is clear that the Sobel and
Prewitt operators are superior to the Roberts operator.
The Prewitt operator is better than the Sobel operator for
a vertical edge. But, for a diagonal edge, the Sobel
operator is superior.

Figure 3.2. Probability of detection versus probability
of false detection for simple differential
operators: a) SNR = 1.0; b) SNR = 10.0

3.2.2  Template  Matching  Operators 


With the compass gradient, Kirsch, 3-level, and
5-level operators, eight convolving masks are used. The
output vector $G$ is jointly Gaussian with mean and
covariance matrix as given in Table 3.2. The mean $\tilde{G}$ is
zero for no edge, and $\tilde{G}$ for a $\pi/4$ edge is the same as $\tilde{G}$ for
a vertical edge with all the components shifted one position
downward.

For these operators, computation of p(A) is not
straightforward. However, their performance can be
evaluated using the probability density function p(G). As
an example,

$$P_F = P(A > t \mid \text{no edge}) = P(|G_1| > t \ \text{OR} \ \cdots \ \text{OR} \ |G_8| > t \mid \text{no edge}) = 1 - \int_{-t}^{t} \cdots \int_{-t}^{t} p(G \mid \text{no edge})\, dG_1\, dG_2 \cdots dG_8 \qquad (3.34)$$

*The signal-to-noise ratio is defined as
SNR = (edge height / noise standard deviation)^2.

Equation 3.34 can be evaluated numerically using the
parameters in Table 3.2. In Figure 3.3, $P_D$ is plotted as a
function of $P_F$ for the different template matching
operators for SNR = 1.0 and 10.0. From these curves, it is
clear that the 3-level and 5-level operators have the best
performances, followed by the Kirsch and finally the
compass gradient operator. This can be explained by the
fact that with the Kirsch and compass gradient operators
more points are used in evaluating A, and thus, more noise
is introduced, while these points are combined in such a
way that they do not enhance the edge output.


3.3 Estimation of the Edge Orientation


The analysis in the previous section can be extended
to the estimation of edge orientation. For the simple
differential operators, the edge orientation is determined
by the angle

$$\theta_G = \tan^{-1}\left(\frac{Y}{X}\right) \qquad (3.35)$$

If X and Y correspond to no edge, they are zero mean
Gaussian random variables. In this case, $\theta_G$ is a random
variable with $p(\theta_G)$ given by

$$p(\theta_G) = \frac{1}{2\pi} \qquad (3.36)$$

for $0 \le \theta_G < 2\pi$. If Y and X correspond to an edge, their
means are nonzero in general, and $p(\theta_G)$ is given by [19]

$$p(\theta_G) = \frac{e^{-\alpha^2}}{2\pi} \left\{1 + 2\sqrt{\pi}\,\alpha \cos\gamma \left[\tfrac{1}{2} + \operatorname{erf}\!\left(\sqrt{2}\,\alpha\cos\gamma\right)\right] \exp\left(\alpha^2 \cos^2\gamma\right)\right\} \qquad (3.37)$$

where

$$\alpha = \frac{a}{\sqrt{2}\,\sigma_r} \qquad (3.38)$$

and

$$\gamma = \theta_G - \tan^{-1}\left(\frac{\tilde{Y}}{\tilde{X}}\right) \qquad (3.39)$$

Figure 3.3. (Continued) b) SNR = 10.0


The conditional probability of estimating the edge
orientation, within a tolerance $\Delta\phi$, given that the region
corresponds to an edge with orientation $\phi$, is in the form

$$P(\phi - \Delta\phi < \theta_G < \phi + \Delta\phi \mid \text{edge}, \angle\phi) = \int_{\phi - \Delta\phi}^{\phi + \Delta\phi} p(\theta_G \mid \text{edge}, \angle\phi)\, d\theta_G \qquad (3.40)$$

It should be noticed that the probability of the exact
estimation of the orientation of a noisy edge is zero.


For the template matching operators, the detection of
the edge orientation angle can be considered as
multiple-hypotheses testing. If the actual edge angle is
$\theta_i$, the probability of making a correct decision is

$$P(\theta_G = \theta_i \mid \text{edge}, \angle\theta_i) = P(G_i > G_k \ \forall k \mid \text{edge}, \angle\theta_i) = \int_{-\infty}^{\infty} \int_{-\infty}^{G_i} \cdots \int_{-\infty}^{G_i} p(G \mid \text{edge}, \angle\theta_i)\, dG_1\, dG_2 \cdots dG_8 \qquad (3.41)$$

Equation 3.41 can be evaluated numerically.


Since the estimation of the edge orientation is
affected by more sources of error, compared with the
detection of the edge presence or absence, this additional
information should be used carefully. An unwise usage of
the estimated edge orientation may reduce edge detector
performance. More research is needed to find an optimum
strategy for using edge orientation information.

3.4  Conclusion 

In this chapter, a statistical model for edge
detection has been developed. The performance of the
different edge detectors is evaluated for actual central
edges with specific edge orientations. The success in
introducing such a model helps in transferring the
communication theory concepts into edge detection problems.
This is a major point in the analysis and design of edge
detectors, because many problems in edge detection have
already been solved in communication theory. It is
interesting to notice that the magnitude and angle of the
simple differential operators have the same probability
density functions as the envelope and phase of a narrowband
signal with additive Gaussian noise [19]. Other examples
can be noticed and used successfully.


Chapter  4 

Edge Detection as a Pattern Classification Problem


Edge  detection  as  a hypothesis-testing  problem  was 
presented  in  Chapter  3.  Another  approach,  which  is 
introduced  in  this  chapter,  is  to  consider  edge  detection 
as  a pattern  classification  problem.  The  edge  detector  has 
as  its  input  different  image  subregions,  and  it  is  required 
to classify these subregions into the class of edges $\Omega_1$
and the class of no edges $\Omega_2$. The decision strategy given
by Eq. 2.3 can be written in the form

$$\text{If } w(1)A + w(2) > 0 \text{ then } A \in \Omega_1 \qquad (4.1a)$$

$$\text{and if } w(1)A + w(2) < 0 \text{ then } A \in \Omega_2 \qquad (4.1b)$$

where the weighting vector $w = [w(1) \; w(2)]^T$ is related to
the threshold t by the relation

$$t = -\frac{w(2)}{w(1)} \qquad (4.2)$$


The components of w are obtained by training the edge
detector using a set of known edge and no edge patterns.
After this training phase, the edge detector is used to
classify unknown prototypes in actual images. The
performance with actual images will depend on the procedure
used in the training phase. There are different methods
that can be used in training a pattern classifier. A
review of these methods is given in Section 4.1. One of
these methods, the Ho-Kashyap algorithm, will be used in
the edge detector design. The basic concepts of this
algorithm and the reason behind its choice are discussed in
Section 4.2. Experimental results are summarized in
Section 4.3.

4.1  Training  Methods  for  Pattern  Classifiers 


The decision function in Eq. 4.1 is based on the scalar
variable A. This decision function can be generalized to
the n-dimensional case

$$d(x_n) = w_n^T x_n + w(n+1) \qquad (4.3)$$

where $x_n = [x(1), x(2), \ldots, x(n)]^T$ is the pattern vector and
$w_n = [w(1), w(2), \ldots, w(n)]^T$ is the weight vector. Usually,
Eq. 4.3 is expressed in the form

$$d(x) = w^T x \qquad (4.4)$$

where $x = [x(1), x(2), \ldots, x(n), 1]^T$ is an augmented pattern
vector and $w = [w(1), w(2), \ldots, w(n), w(n+1)]^T$ is an
augmented weight vector [20]. The decision strategy is
then

$$\text{If } w^T x > 0 \text{ then } x \in \Omega_1 \qquad (4.5a)$$

$$\text{and if } w^T x < 0 \text{ then } x \in \Omega_2 \qquad (4.5b)$$

In the training phase, the pattern classifier is given
two sets of prototype patterns $\{x_1, x_2, \ldots, x_N\} \in \Omega_1$ and
$\{x_{N+1}, x_{N+2}, \ldots, x_{2N}\} \in \Omega_2$. The weight vector w is
determined such that $w^T x > 0$ for all patterns of $\Omega_1$ and
$w^T x < 0$ for all patterns of $\Omega_2$. If the patterns of $\Omega_2$ are
multiplied by (-1), the required condition becomes $w^T x > 0$
for all patterns. The pattern classification problem is
then reduced to finding a vector w such that

$$X w > 0 \qquad (4.6)$$

is satisfied, where

$$X = \begin{bmatrix} x_1^T \\ \vdots \\ x_N^T \\ -x_{N+1}^T \\ \vdots \\ -x_{2N}^T \end{bmatrix} \qquad (4.7)$$

If there exists a w which satisfies Eq. 4.6, the classes
are said to be separable; otherwise they are nonseparable
[20].


One approach to the solution of the set of linear
inequalities of Eq. 4.6 is to define a criterion function
J(w) that becomes minimum if w satisfies Eq. 4.6. This
reduces the problem to one of minimizing a scalar function,
a problem that can be solved by a gradient descent
procedure [4]. An example of a criterion function that
can be used is the perceptron criterion function

$$J_p(w) = \sum_{x \in \mathcal{X}} (-w^T x) \qquad (4.8)$$

where $\mathcal{X}$ is the set of samples misclassified by w. Another
example is

$$J_r(w) = \frac{1}{2} \sum_{x \in \mathcal{X}} \frac{(w^T x - b)^2}{\|x\|^2} \qquad (4.9)$$

where now $\mathcal{X}$ is the set of samples for which $w^T x < b$. The
previous two criterion functions focus their attention on
the misclassified samples. A different criterion function
that involves all the samples is

$$J_s(w) = \| X w - b \|^2 \qquad (4.10)$$

where the components of b are all positive. The
minimization of $J_s(w)$ depends on the value of b. If b is
fixed arbitrarily there is no guarantee that the solution
will give a separating vector in the linearly separable
case. To avoid that, b and w are allowed to vary in the
minimization procedure. This is the basic concept of the
Ho-Kashyap algorithm. Another approach to solve the
inequalities in Eq. 4.6 is to use linear programming
procedures. Details of these procedures and analysis of
the other previous methods are given in references [4, 20].


In  order  to  use  any  of  the  previous  methods  in  the 
design  of  edge  detectors,  two  conditions  for  the  resulting 
vector  w are  required.  First,  if  the  training  patterns  are 
separable,  the  training  procedure  should  converge  to  a w 
which  classifies  the  patterns  correctly.  Second,  if  the 
training  patterns  are  not  separable,  a case  which  is 
usually  encountered  in  edge  detection,  the  training 
procedure  should  detect  the  nonseparability  and  yield  a 
solution  which  can  be  used  practically.  These  two 
conditions are achieved only by the Ho-Kashyap algorithm
[21], and by a linear programming procedure that minimizes
the perceptron criterion function [22]. Either of these two
methods can be used in edge detector design. The
performance of each method will depend on the distribution
of the classes. A comparison between the two methods is
outside the scope of this dissertation. Therefore, in the
following section a discussion of one of them, the
Ho-Kashyap algorithm, and its application in edge
detection, is given. A similar analysis can be developed
for the linear programming procedure.

4.2  The  Ho-Kashyap  Algorithm 

In this algorithm, the solution of the inequalities in
Eq. 4.6 has been reformulated as a problem of finding w and
b > 0 such that $J_s(w)$ in Eq. 4.10 is minimized. The
minimization can be achieved by a steepest descent
procedure that implements the gradient functions

$$\frac{\partial J_s}{\partial w} = 2 X^T (X w - b) \qquad (4.11a)$$

$$\frac{\partial J_s}{\partial b} = 2 (b - X w) \qquad (4.11b)$$

Since there is no constraint on w, $\partial J_s / \partial w = 0$ implies

$$w = (X^T X)^{-1} X^T b = X^{\#} b \qquad (4.12)$$


where $X^{\#}$ is the pseudo inverse of X. Since all the
components of b are constrained to be positive, this vector
must be varied in such a manner as to never violate this
constraint. This can be accomplished by letting

$$b(k+1) = b(k) + \delta b(k) \qquad (4.13)$$

where

$$\delta b(k) = c\,[e(k) + |e(k)|] \qquad (4.14a)$$

$$e(k) = X w(k) - b(k) \qquad (4.14b)$$

In Eqs. 4.13 and 4.14, k denotes the iteration index, c is
a positive correction increment, and |e(k)| indicates the
absolute value of each component of the error vector e(k)
[20]. From Eqs. 4.12 and 4.13,

$$w(k+1) = w(k) + X^{\#} \delta b(k) \qquad (4.15)$$

Thus, Eq. 4.10 can be minimized through the iteration

$$w(1) = X^{\#} b(1) \qquad (4.16)$$

$$e(k) = X w(k) - b(k) \qquad (4.17)$$

$$w(k+1) = w(k) + c X^{\#} [e(k) + |e(k)|] \qquad (4.18)$$

$$b(k+1) = b(k) + c\,[e(k) + |e(k)|] \qquad (4.19)$$

where b(1) > 0 but otherwise is arbitrary, and c is a
constant such that 0 < c < 1.
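
The iteration of Eqs. 4.16 to 4.19 can be sketched for the scalar edge detector output, where each augmented pattern has only two components and the pseudo inverse reduces to a 2 x 2 matrix inversion. This is a minimal sketch, not the implementation used in the experiments: the function names, the stopping tolerance, and the sample values are hypothetical, and the nonseparability test (no positive error component while the error is not negligible) is one simple reading of the stopping rule.

```python
def pseudo_inverse_two_columns(rows):
    """Return (X^T X)^-1 X^T for an n x 2 matrix given as a list of pairs."""
    s11 = sum(r[0] * r[0] for r in rows)
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows)
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("X^T X is singular")
    inverse = ((s22 / det, -s12 / det), (-s12 / det, s11 / det))
    return [[inverse[i][0] * r[0] + inverse[i][1] * r[1] for r in rows]
            for i in range(2)]


def ho_kashyap(edge_outputs, no_edge_outputs, c=0.5, max_iter=500, tol=1e-3):
    """Train w = [w(1), w(2)] on scalar detector outputs A.

    Returns (w, status) with status 'separable', 'nonseparable' or
    'undecided' (iteration limit reached).
    """
    # Augmented patterns; the no-edge rows are multiplied by -1.
    rows = ([(a, 1.0) for a in edge_outputs]
            + [(-a, -1.0) for a in no_edge_outputs])
    pinv = pseudo_inverse_two_columns(rows)
    b = [1.0] * len(rows)                                   # b(1) > 0
    w = [sum(p * bi for p, bi in zip(pinv[i], b)) for i in range(2)]
    for _ in range(max_iter):
        e = [r[0] * w[0] + r[1] * w[1] - bi for r, bi in zip(rows, b)]
        if all(abs(ei) < tol for ei in e):
            return w, "separable"
        if all(ei <= 0.0 for ei in e):
            return w, "nonseparable"
        delta = [c * (ei + abs(ei)) for ei in e]            # c [e + |e|]
        w = [w[i] + sum(p * d for p, d in zip(pinv[i], delta))
             for i in range(2)]
        b = [bi + d for bi, d in zip(b, delta)]
    return w, "undecided"


weights, status = ho_kashyap([4.0, 5.0, 6.0], [0.0, 1.0, 2.0])
threshold = -weights[1] / weights[0]
print(status, threshold)
```

Because only the positive part of the error is added to b, every component of b can only grow, so the constraint b > 0 is never violated. The threshold is recovered from the trained weights through t = -w(2)/w(1).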


If the patterns are separable, Eqs. 4.17 to 4.19 can
be repeated until all components of e(k) converge to zero,
or to any reasonably small value. On the other hand, if
the components of e(k) cease to be positive, but are not
all zero, at any iteration step, this will indicate that
the classes are not separable [20, 21]. These two
characteristics of the Ho-Kashyap algorithm are important,
especially when the algorithm is used to design edge
detectors. Because the degree of separability of the
classes of edges and no edges changes for different image
models, the procedure used in the edge detector design
should be able to handle both separable and nonseparable
patterns.

4.3 Application of the Ho-Kashyap Algorithm to Edge
Detection

The Ho-Kashyap algorithm is used in the design of edge
enhancement/thresholding operators. In this experiment,
patterns of vertical edges, and patterns of no edges, are
generated. Gaussian noise is added to produce edge
prototypes with SNR = 1.0 or 10.0. The outputs of the
different edge detectors in the case of a vertical edge
$\{A_1, A_2, \ldots, A_N\}$, and in the case of no edge
$\{A_{N+1}, A_{N+2}, \ldots, A_{2N}\}$, are used to construct the augmented
matrix

$$X = \begin{bmatrix} A_1 & 1 \\ \vdots & \vdots \\ A_N & 1 \\ -A_{N+1} & -1 \\ \vdots & \vdots \\ -A_{2N} & -1 \end{bmatrix} \qquad (4.20)$$

The number of patterns of each class is chosen to be
N = 20. This ensures that the performance on design and
test data will be similar [4]. The initial components of
b(1) are chosen to be unity, and the iteration given by
Eqs. 4.17 to 4.19 is repeated up to 500 times. The
experiment is ended if the components of e(k) are all less
than a small value (0.001), or if nonseparability is
proved. It is sometimes useful to end the iteration when
the threshold t = -w(2)/w(1) stabilizes within a
relatively small variation.

After the training phase is finished, the values of w
obtained are tested with a new set of 250 prototypes
generated with the same model. The probability of
detection in the case of an edge, and the probability of
false detection in the case of no edge, are calculated.
The results obtained are compared with the theoretical
results derived in Chapter 3. These results are given in
Table 4.1 for different edge detectors with vertical and
$\pi/4$ edges and SNR = 1.0 and 10.0, respectively. It should
be noticed that in many cases the edge detector threshold t
converges to a value which results in equal probabilities
of error

$$P_F \approx 1 - P_D \qquad (4.21)$$

This satisfies the Bayes minimum error criterion if edges
and no edges are equally probable. Thus, the results
obtained with the Ho-Kashyap algorithm have practical
significance.

4.4  Conclusion 

In this chapter, it has been shown that edge detectors
can be designed using pattern classification techniques.
As an example, the Ho-Kashyap algorithm was used to design
different edge enhancement/thresholding operators. The
edge model used was an ideal edge plus Gaussian noise.
This model helps in comparing the experimental results with
the theoretical ones obtained in Chapter 3. The same
technique can be easily extended to the design of any edge
detector with any arbitrary noise model.

Chapter 5

Figure  of  Merit  Comparison  of  Edge  Detectors 

The  methods  introduced  in  the  previous  two  chapters 
can  be  used  in  both  the  evaluation  and  the  design  of  edge 
detectors. In this chapter, a third method, which can be
used only in the evaluation of edge detector performance,
is introduced. The procedure used in this chapter can be

summarized  as  follows.  First,  an  artificial  test  image  is 

generated.  Second,  an  edge  detector  is  applied  on  this 
test  image.  Third,  the  quality  of  the  resulting  edge  map 
is  measured  in  terms  of  a scalar  function.  That  function 
can  be  considered  as  a figure  of  merit  of  the  corresponding 
edge  detector.  The  figure  of  merit  used  should  be 
sensitive  to  the  different  expected  errors  so  that  it  is 
maximum  when  the  edge  map  is  perfect,  and  decreases  as  the 
error  in  the  edge  map  increases.  Methods  based  on  the 
previous technique have been introduced by Fram and Deutsch
[13], and by Pratt [1]. This latter method has two

advantages:  it  weights  the  different  errors  according  to 

their  importance;  and  it  allows  each  edge  detector  to  be 
tuned  to  its  best  capabilities,  which  guarantees  a fair 
comparison. Because of these advantages, the experiments
discussed in the following sections will be based on the
figure of merit developed by Pratt. Section 5.1 explains
the basic ideas of this technique. Section 5.2 summarizes
the results obtained with simple test images. Section 5.3
introduces conclusions based on the results of Chapters 3, 4
and 5.

5.1 Figure of Merit Concepts

The  procedure  introduced  by  Pratt  utilizes  a test 
image  consisting  of  a 64  x 64  pixels  array  over  a 0 to  255 
amplitude  range  with  a vertically  oriented  edge  of  variable 
contrast  and  slope  placed  at  its  center.  Independent 
Gaussian noise of standard deviation $\sigma$ is added to the
edge image, and the resultant picture is clipped to the
maximum display limits. As in the previous chapters, the
signal-to-noise ratio is defined as

$$\text{SNR} = \left(\frac{h}{\sigma}\right)^2 \qquad (5.1)$$

where h is the edge height.


When an edge detector is applied to this test image,
three major types of error will affect the resulting edge
map: (a) missing of valid edge points; (b) failure to
localize edge points; (c) classification of noise pulses
as edge points. Examples of these errors are shown in
Figure 5.1.

Figure 5.1. Types of edge detection errors: a) vertical edge
test image (N x N, N = even integer); b) ideal; c) fragmented;
d) offset; e) smeared


The quality of the resulting edge map may be assessed by the figure
of merit defined by

$$F = \frac{1}{\max(I_I, I_A)} \sum_{i=1}^{I_A} \frac{1}{1 + a d_i^2}$$    (5.2)

where $I_I$ and $I_A$ represent the number of ideal and actual edge
map points, respectively, a is a scaling constant, and $d_i$ is the
separation distance of the i-th actual edge point normal to a line
of ideal edge points. The rating factor is normalized so that F = 1
for a perfectly detected edge. The scaling factor a may be adjusted
to penalize edges which are localized but offset from the true
position. Normalization by the maximum of the actual and ideal
number of edge points ensures a penalty for smeared or fragmented
edges. This figure of merit gives a higher rating for a smeared edge
than for an offset edge. This is reasonable because it is possible
to thin the smeared edge by post-processing [1].
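The behaviour of Eq. 5.2 on the error types of Figure 5.1 can be
checked with a short sketch. The function name, the toy 8-row edge
maps, and the value a = 1/9 are illustrative assumptions, not values
taken from the experiments; the ideal edge is assumed vertical so
that d is simply a column difference.

```python
def figure_of_merit(ideal_col, actual_points, n_ideal, a=1.0 / 9.0):
    """Figure of merit of Eq. 5.2 for a vertical ideal edge.

    ideal_col     -- column index of the ideal vertical edge
    actual_points -- iterable of (row, col) pixels marked as edge
    n_ideal       -- number of ideal edge points, I_I
    a             -- scaling constant (1/9 is an assumed example value)
    """
    actual_points = list(actual_points)
    n_actual = len(actual_points)              # I_A
    if n_actual == 0:
        return 0.0
    total = 0.0
    for _row, col in actual_points:
        d = col - ideal_col                    # distance normal to the ideal edge
        total += 1.0 / (1.0 + a * d * d)
    return total / max(n_ideal, n_actual)


# Toy edge maps for an 8-row image whose ideal edge lies in column 4.
N = 8
ideal = [(r, 4) for r in range(N)]
offset = [(r, 5) for r in range(N)]                       # shifted one column
smeared = [(r, c) for r in range(N) for c in (3, 4, 5)]   # three columns wide
fragmented = [(r, 4) for r in range(0, N, 2)]             # half the points missing

for name, points in [("ideal", ideal), ("offset", offset),
                     ("smeared", smeared), ("fragmented", fragmented)]:
    print(name, round(figure_of_merit(4, points, N), 4))
```

With these assumed inputs the smeared map scores higher than the
offset map, in line with the remark above, and the fragmented map is
penalized through the normalization by max(I_I, I_A).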


The figure of merit method has been used to evaluate the performance
of the Roberts, Kirsch, Sobel, and compass gradient operators. In
each case, the thresholds are chosen to maximize the figure of
merit; plots of these maximum values are given in [1]. The results
obtained in this experiment can be predicted theoretically using the
probabilities of detection of central edges $P_D$, of displaced
edges $P_{Dis}$, and of false detection $P_F$, for a given edge
detector. As an example, if a 3x3 edge detector is applied to the
test image shown in Figure 5.1, there will be a central edge at
column N/2 + 1, displaced edges at the two adjacent columns, and no
true edges elsewhere. For this case, Eq. 5.2 reduces to


where 

$$I_N = \max\{N,\ [P_D + 2P_{Dis} + (N-3)P_F]\,N\}$$    (5.4)

The analysis introduced thus far is based on a test image that
contains a vertical edge. The same analysis can be extended to
other image models, but in these cases the evaluation of Eq. 5.2
becomes more difficult. Another test image which is relatively easy
to analyze is one that contains a diagonal edge. As has been shown
in Chapter 2, the results obtained from the vertical and the
diagonal edge models are sufficient to determine edge detector
performance.

A test image that contains a diagonal edge is shown in Figure 5.2.
The image consists of 128 x 128 pixels generated with the same
signal and noise models used in the test image that contains the
vertical edge. To simplify the comparison of the results obtained in
both cases, only the central part of the diagonal edge is used in
calculating the figure of merit. This central region is shown
bounded by dotted lines in Figure 5.2. The number of edge pixels in
this region is chosen to be equal to the number of edge pixels in
the vertical edge model. However, the number of non-edge pixels in
the diagonal edge model is twice their number in the vertical edge
model. The effect of this difference is compensated by scaling the
diagonal distance d by a factor $\sqrt{2}$. The results obtained
with these two test images will be given in the following section.

[Figure 5.2. Figure of merit test image geometry for diagonal edge]

5.2  Experimental  Results 

The Sobel, Prewitt, compass gradient, Kirsch, 3-level and 5-level
operators are evaluated using the figure of merit defined
previously. The test images are generated in the form of ideal steps
with vertical or diagonal orientations. The height is h = 25.
Gaussian noise is added to the ideal step with signal-to-noise
ratios of 1.0, 5.0, 10.0, 20.0, and 100.0. Each edge detector is
applied to the different test images, and the threshold t is varied
until the figure of merit is maximum. Plots of the figure of merit
as a function of signal-to-noise ratio are shown in Figures 5.3 and
5.4. The figures of merit generally follow expected trends: small
for low signal-to-noise ratios and large in the opposite case. Some
of the edge detection methods are superior to others for all test
images. Examples of the edge maps obtained in the previous
experiments are shown in Figure 5.5. It should be noticed that the
figures of merit are correlated with the visual quality of the edge
maps.

[Figure 5.3 (continued). Simple differential operators]

[Figure 5.4 (continued). Template matching operators]
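The experimental procedure of this section (generate a noisy step,
apply an operator, sweep the threshold t, keep the maximum figure of
merit) can be sketched as follows. This is a minimal illustration
under stated assumptions, not the program used for Figures 5.3 and
5.4: the base level of 100, the mid-level center column, the scaling
constant a = 1/9, the 50-step threshold grid, and all function names
are hypothetical choices.

```python
import random


def make_test_image(n=64, base=100.0, h=25.0, snr=10.0, seed=0):
    """Vertical edge image with Gaussian noise, clipped to 0..255.

    The center column is set to the mid level so that a 3x3 operator
    peaks on one column (an assumption about the test geometry).
    """
    rng = random.Random(seed)
    sigma = h / snr ** 0.5                     # from SNR = (h / sigma)^2
    center = n // 2
    image = []
    for _ in range(n):
        row = []
        for c in range(n):
            level = base if c < center else base + h
            if c == center:
                level = base + h / 2.0
            value = level + rng.gauss(0.0, sigma)
            row.append(min(255.0, max(0.0, value)))
        image.append(row)
    return image, center


def prewitt_magnitude(image):
    """Square-root Prewitt gradient magnitude on the interior pixels."""
    n = len(image)
    out = {}
    for r in range(1, n - 1):
        for c in range(1, n - 1):
            gx = sum(image[r + k][c + 1] - image[r + k][c - 1] for k in (-1, 0, 1))
            gy = sum(image[r + 1][c + k] - image[r - 1][c + k] for k in (-1, 0, 1))
            out[(r, c)] = (gx * gx + gy * gy) ** 0.5
    return out


def figure_of_merit(grad, center, t, a=1.0 / 9.0):
    """Eq. 5.2 for the edge map obtained by thresholding grad at t."""
    points = [rc for rc, g in grad.items() if g > t]
    if not points:
        return 0.0
    n_ideal = len({r for r, _ in grad})        # one ideal point per processed row
    total = sum(1.0 / (1.0 + a * (c - center) ** 2) for _, c in points)
    return total / max(n_ideal, len(points))


def best_threshold(grad, center, steps=50):
    """Return (maximum figure of merit, threshold) over a uniform grid of t."""
    g_max = max(grad.values())
    candidates = [g_max * k / steps for k in range(steps)]
    return max((figure_of_merit(grad, center, t), t) for t in candidates)


if __name__ == "__main__":
    for snr in (1.0, 5.0, 10.0, 20.0, 100.0):
        image, center = make_test_image(snr=snr)
        f, t = best_threshold(prewitt_magnitude(image), center)
        print(snr, round(f, 3), round(t, 1))
```

Other operators can be compared by replacing `prewitt_magnitude`
while keeping the same threshold sweep.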

The figures of merit plotted in Figures 5.3 and 5.4 can be related
to the response of an edge detector to displaced edges, shown in
Figure 2.10, and to the operating characteristics of an edge
detector, as shown in Figures 3.2 and 3.3. The figure of merit is
large when the edge detectors have good performance in the presence
of noise, and when the edge detectors suppress noncentral edges
efficiently.

5.3  Conclusion 

In general, the results obtained in Chapters 3, 4 and 5 show that
the 3-level operator has better performance than any of the other
edge detectors. Its performance can be compared only to the
performance of the Prewitt operator. The advantage of the 3-level
operator is that it has almost the same performance for all edge
orientations, while the advantage of the Prewitt operator is that it
requires less computational effort, especially if the square root is
replaced by the sum of magnitudes.


[Figure 5.5. Edge maps for 2x2 and 3x3 operators: a) Prewitt square
root, vertical edge, SNR=1; b) Prewitt square root, vertical edge,
SNR=10; c) Prewitt square root, vertical edge, SNR=100; d) Prewitt
square root, diagonal edge, SNR=10; e) Sobel square root, vertical
edge, SNR=10; f) Roberts square root, vertical edge, SNR=10;
g) 3-level, vertical edge, SNR=10; h) Kirsch, vertical edge, SNR=10]

Chapter 6

New Edge Enhancement/Thresholding Methods

The analysis introduced so far has been concerned with the
evaluation of existing edge detection operators. This evaluation is
one of the two objectives of the dissertation. The other objective
is to introduce new edge detection techniques and to evaluate their
performance. In this chapter, some new trends in edge
enhancement/thresholding are given. In Chapter 7, a new edge fitting
algorithm is discussed.

There are some modifications that can be introduced to the edge
enhancement/thresholding operators, such as changing the mask size,
weighting the mask elements, and using an adaptive thresholding
procedure. Before introducing these modifications, it is useful to
evaluate their effects and to decide if they actually improve the
edge detector performance. This will be the subject of the following
sections. In Section 6.1, the effect of increasing the mask size is
evaluated. In Section 6.2, the effect of weighting the mask elements
is discussed. In Section 6.3, some adaptive edge thresholding
methods are introduced.

6.1 Effect of Changing Mask Size

The 3x3 edge detectors can be considered as a special case of
general (2K+1) x (2K+1) edge detectors. An extension of the two
masks of the Prewitt operator is shown in Figures 6.1a and b. Also,
the set of four masks of Figure 6.1 represents an extension of the
3-level operator. Increasing the mask size will affect edge detector
performance in two ways. First, the operator will be less sensitive
to noise because it bases its decision on a


[Figure 6.1. Extended masks for the Prewitt and the 3-level
operators (panels include c) positive diagonal and d) negative
diagonal)]

In the case of the Prewitt operator, the outputs of the vertical and
horizontal masks are independent Gaussian random variables, with
covariance matrix $\Sigma_G$ and mean $G_v$, in the form

$$\Sigma_G = \begin{bmatrix} 2K(2K+1) & 0 \\ 0 & 2K(2K+1) \end{bmatrix}$$    (6.3)

$$G_v = h\,[\,K(2K+1) \quad 0\,]^T$$    (6.4)


The probabilities of detection and false detection can be evaluated
as in Chapter 3. Plots of the edge detector operating
characteristics for a signal-to-noise ratio of 1.0 and operator mask
sizes of 5x5, 7x7, and 9x9 are given in Figure 6.2. From these
plots, it is clear that the performance of the 3-level operator is
better than that of the Prewitt operator for diagonal edges, while
it is slightly worse than that of the Prewitt operator for vertical
edges. Also, it can be easily noticed that performance improves as
the mask size increases. On the other hand, increasing the mask size
will reduce the edge detector resolution. This effect can be shown
by plotting edge detector output as a function of the distance
between the edge and the center of the operator. Plots of the
normalized outputs of 3x3, 5x5, 7x7, and 9x9 mask operators, in the
case of a vertical edge, are shown in Figure 6.3. It is clear that,
as the mask size increases, the region over which the edge is
detected increases. This will reduce the operator's ability to
detect the finer details of the image.

[Figure 6.2. Probability of detection versus probability of false
detection for extended Prewitt and 3-level operators (curves
include Prewitt vertical and Prewitt diagonal)]
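Both effects of the mask size can be illustrated numerically for the
extended vertical Prewitt mask. The sketch below is an assumption-
laden illustration: it takes the mask weights to be -1, 0, +1 by
column, models the edge as an ideal unit step, and assumes
independent pixel noise of standard deviation sigma so that each
mask output has variance 2K(2K+1) sigma^2. The function names are
hypothetical.

```python
def normalized_response(K, d):
    """Output of the (2K+1)x(2K+1) vertical Prewitt-type mask for an ideal
    vertical step displaced d >= 0 columns, divided by the central output."""
    size = 2 * K + 1
    total = 0
    for _row in range(size):
        for c in range(-K, K + 1):
            weight = (c > 0) - (c < 0)          # -1, 0, +1 by column
            pixel = 1 if c >= 1 + d else 0      # unit step starting at column 1 + d
            total += weight * pixel
    return total / (K * size)                   # central output is K(2K+1)


def signal_to_noise_gain(K):
    """Mean central output over its noise standard deviation, in units of
    h / sigma, assuming independent pixel noise."""
    return K * (2 * K + 1) / (2 * K * (2 * K + 1)) ** 0.5


for K in (1, 2, 3, 4):                          # 3x3, 5x5, 7x7, 9x9 masks
    profile = [round(normalized_response(K, d), 2) for d in range(5)]
    print(2 * K + 1, round(signal_to_noise_gain(K), 3), profile)
```

Under these assumptions the noise margin grows with K, while the
normalized response to a displaced edge decays more slowly, which is
the loss of resolution described above.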

The previous two effects can be measured simultaneously by using the
figure of merit defined in Chapter 5. The 3-level and the Prewitt
operators are applied to the test images containing a vertical and a
diagonal edge. The figure of merit is plotted as a function of the
signal-to-noise ratio. These curves are shown in Figure 6.4. The
results agree with the previous analysis: for low signal-to-noise
ratio, the operators with large mask size have better performance
because they are less sensitive to noise, which is a dominant factor
in this case, while for large signal-to-noise ratio, the operators
with small mask size have better performance because they are more
accurate in detecting edge location. Examples of the edge maps for
the vertical edge with SNR = 1.0 are shown in Figure 6.5. These
examples give a visual indication of the improvement achieved by
increasing the mask size.

Since the 3-level and the Prewitt operators achieve an almost
optimum performance while using simple computation procedures, the
performance of these operators can be used as a standard to which
any other edge detector performance should be compared. As an
example, the performance of the conventional 69-pixel Hueckel
operator is compared with the performance of a 7x7 or a 9x9 mask
operator in Appendix A. This comparison indicates that the 3-level
and the Prewitt operators have better performance than the Hueckel
operator.

[Figure 6.4 (continued)]

[Figure 6.5. Edge maps for extended Prewitt operator, vertical test
image with SNR = 1 (panels include b) 7x7 mask)]

6.2  Use  of  Weighted  Masks 

The resolution of edge detectors with large mask size can be
improved by weighting the mask elements, such that they are maximum
near the mask center and decrease to zero as they approach the mask
periphery. There are many examples of weighted masks that can be
used in edge detection. Argyle [23] has proposed a split Gaussian
function defined in one dimension as

$$H(x) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,k} \exp\left(-\dfrac{x^2}{2k^2}\right), & x > 0 \\ 0, & x = 0 \\ -\dfrac{1}{\sqrt{2\pi}\,k} \exp\left(-\dfrac{x^2}{2k^2}\right), & x < 0 \end{cases}$$    (6.5)

where k is a spread constant. Macleod [24] introduced a continuous
Gaussian function; a special case of the Macleod function is given
by

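$$H(x,y) = \exp\left[-\left(\frac{y}{t}\right)^2\right] \left\{ \exp\left[-\left(\frac{x-p}{p}\right)^2\right] - \exp\left[-\left(\frac{x+p}{p}\right)^2\right] \right\}$$    (6.6)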

where p and t are spread constants. Another example of the
weighting functions is the polynomial

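$$H(x,y) = \begin{cases} \dfrac{1}{1+ay^2} \cdot \dfrac{1}{1+ax^2}, & x > 0 \\ 0, & x = 0 \\ -\dfrac{1}{1+ay^2} \cdot \dfrac{1}{1+ax^2}, & x < 0 \end{cases}$$    (6.7)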

where a is an adjustable scaling factor. The elements of the
previous weighted masks are not integers, and thus require more
computation time compared with the simple 3-level mask. This problem
can be avoided if the weighted mask is chosen to be the
pyramid-shaped mask shown in Figure 6.6.


To test the resolution of the different weighted masks, the outputs
of 7x7 weighted mask operators for displaced vertical edges are
plotted in Figure 6.7. In this experiment, k = p = t = 4.0* and
a = 1/9. The results show that the pyramid-shaped mask has the best
resolution, followed by the polynomial, the Argyle, the simple
3-level, and finally the Macleod weighted mask.


The statistical model of Chapter 3 can be used to evaluate the
performance of the weighted mask operators. As an example, for the
weighted Prewitt operator, the performance will depend on the ratio
between the ideal edge output (a) and the noise standard deviation
($\sigma_r$). The larger this ratio, the better the performance. In
Table 6.1, the values of $a/\sigma_r$ for the 7x7 weighted mask
operators are given. These ratios, and hence the performance of the
weighted mask operators, depend on the shape of the weighting
function used. In general, the edge detector will have a better
performance in the presence of noise if the mask elements are more
uniform, with the optimum performance achieved by using equal mask
elements.

*These are the parameters suggested by Fram and Deutsch in their
paper [13].

[Figure 6.6. Pyramid operator]

[Figure 6.7. Edge gradient amplitude response as a function of edge
displacement for weighted 7x7 operators]
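The claim that equal mask elements maximize the ratio of ideal edge
output to noise standard deviation can be checked with a small
sketch. It assumes a 7x7 vertical mask whose columns carry a
one-dimensional weight profile (rows unweighted), an ideal central
step of height h, and independent pixel noise of standard deviation
sigma; these are simplifying assumptions, and the function names are
hypothetical. Only the uniform profile and the split Gaussian of
Eq. 6.5 with k = 4 are compared.

```python
import math


def argyle_weight(x, k=4.0):
    """Split Gaussian of Eq. 6.5 (odd in x)."""
    if x == 0:
        return 0.0
    g = math.exp(-x * x / (2.0 * k * k)) / (math.sqrt(2.0 * math.pi) * k)
    return g if x > 0 else -g


def uniform_weight(x):
    """Equal mask elements: -1, 0, +1 by column."""
    return float((x > 0) - (x < 0))


def edge_to_noise_ratio(weight, K=3, h=1.0, sigma=1.0):
    """Ideal edge output divided by the output noise standard deviation
    for a (2K+1)x(2K+1) mask whose columns carry weight(x)."""
    rows = 2 * K + 1
    cols = range(-K, K + 1)
    # Ideal central step: height h on the positive-column side, 0 elsewhere.
    a = h * rows * sum(weight(x) for x in cols if x > 0)
    sigma_r = sigma * math.sqrt(rows * sum(weight(x) ** 2 for x in cols))
    return a / sigma_r


print("uniform:", round(edge_to_noise_ratio(uniform_weight), 4))
print("Argyle :", round(edge_to_noise_ratio(argyle_weight), 4))
```

For any fixed mask size the uniform profile gives the largest ratio
(a consequence of the Cauchy-Schwarz inequality), so weighting trades
some noise performance for better resolution.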

The different weighted-mask edge detectors can be evaluated using
the figure of merit of Chapter 5. In this experiment, the vertical
edge test image is used to evaluate the Argyle, Macleod, polynomial
and pyramid-shaped operators with a mask size of 7x7. Results are
shown in Figure 6.8. It is clear that, excluding the Macleod
operator, most of the weighted mask operators have approximately
identical performances. The inferior performance of the Macleod
operator can be improved by changing its parameters.


6.3  Use  of  Adaptive  Thresholding 

In the previous experiments, the value of the threshold t was found
to be a function of the absolute signal levels and the
signal-to-noise ratio. In simple test images, t can be a constant
for all the image subregions. In real world images, however, a
constant threshold should not be used because it will enhance the
boundaries between high intensity regions more than the boundaries
between low intensity regions. This problem can be avoided if the
output of the edge detectors is compared with a function of the
subregion intensities. This can be considered as a local adaptive
thresholding procedure [7].

[Table 6.1. Values of $a/\sigma_r$ for the 7x7 weighted mask
operators (rows include pyramid, polynomial)]
Examples of the functions that can be used are the average

t = ...    (6.8)

the root mean square

t = ...    (6.9)

and in general

t = ...    (6.10)

In Eqs. 6.8 to 6.10, the summed quantities are the pixel
intensities, and $a_1$, $a_2$ are constants that can be adjusted.
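The idea of a threshold that follows the local intensity can be
sketched as below. The exact forms of the threshold functions are
not reproduced here; the generalized-mean expression used in
`adaptive_threshold`, the role given to the two constants, and the
toy subregions are all assumptions made for illustration only.

```python
def adaptive_threshold(pixels, a1=0.2, a2=1.0):
    """Hypothetical local threshold t = a1 * (mean(p ** a2)) ** (1 / a2).

    a2 = 1 gives a1 times the average of the subregion; a2 = 2 gives a1
    times its root mean square. The generalized-mean form is an assumption.
    """
    mean_power = sum(p ** a2 for p in pixels) / len(pixels)
    return a1 * mean_power ** (1.0 / a2)


# Two subregions with the same relative contrast but different brightness.
dark = [20.0, 20.0, 30.0, 30.0]          # edge detector output (difference) = 10
bright = [200.0, 200.0, 300.0, 300.0]    # edge detector output (difference) = 100

constant_t = 50.0
dark_detected_constant = 10.0 > constant_t                      # boundary missed
bright_detected_constant = 100.0 > constant_t                   # boundary found
dark_detected_adaptive = 10.0 > adaptive_threshold(dark)        # 10 > 5
bright_detected_adaptive = 100.0 > adaptive_threshold(bright)   # 100 > 50

print(dark_detected_constant, bright_detected_constant)
print(dark_detected_adaptive, bright_detected_adaptive)
```

In this toy case the constant threshold favors the boundary between
the high intensity regions, while the intensity-dependent threshold
treats both boundaries alike.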

A quantitative evaluation of these adaptive thresholding methods is
not simple because it requires knowledge of the image model. A
discussion of the problem will be given in Chapter 8. Some of the
experimental results obtained with the adaptive thresholding edge
detectors will be shown in Appendix e.


6.4 Conclusion

In this chapter, various modifications in the edge
enhancement/thresholding operators have been considered. The purpose
of these changes is to achieve a compromise between better
resolution and acceptable performance in the presence of noise. It
is believed that this compromise should be one of the basic
objectives in edge detector design. Other methods that achieve
better edge resolution through edge thinning can be found in the
works of Rosenfeld [5, 2b] and Herskovits [2b].


Chapter  7 

A New  Edge  Fitting  Algorithm 


Minimum-error  surface  fitting  techniques  have  been 
considered  by  many  as  an  optimum  solution  to  the  edge 
detection  problem.  Although  this  is  true  theoretically,  in 
practical  applications,  the  surface  fitting  algorithms 
suffer  from  two  drawbacks.  The  first  is  that  the  image  is 
usually  defined  over  a sampled  domain  while  most  of  the 
surface  fitting  algorithms  are  derived  for  continuous 
functions.  The  second  is  that  even  assuming  the  image  to 
be  continuous,  the  optimization  procedures  require  the 
solution  of  implicit  functions  of  the  edge  parameters. 
This  solution  can  be  achieved  through  iterative  procedures, 
which  are  time  consuming  and  thus  cannot  be  practically 
used  in  edge  detection.  Usually  some  approximations  are 
made  to  avoid  this  iterative  solution.  As  an  example,  in 
the  Hueckel  operator,  the  optimization  procedure  is 
simplified  by  using  truncated  Fourier  expansions  of  the 
image  subregion  and  the  ideal  edge  model.  The  effect  of 
this  approximation  on  the  optimality  of  the  solution  cannot 
be easily evaluated.

The previous difficulties can be avoided by using edge fitting
algorithms based on the discrete image model. One of these
algorithms will be introduced in the following sections. In
Section 7.1, a one-dimensional edge fitting algorithm is discussed.
In Section 7.2, the model is extended to the more important case of
two-dimensional edge fitting. In Section 7.3, an evaluation of the
edge fitting algorithm performance is given.

7.1  One-Dimensional  Edge  Fitting 


The problem of one-dimensional edge fitting can be stated as
follows: given a continuous function f(x) defined for
$-b \le x \le b$, it is required to find a piecewise linear function
$s_p(x)$ such that the error

$$E_p = \int_{-b}^{b} \left( s_p(x) - f(x) \right)^2 dx$$    (7.1)

is minimum. The problem can be simplified by assuming that the
function $s_p(x)$ is centered around the origin, as shown in
Figure 7.1. In this case $s_p(x)$ is given by

$$s_p(x) = \begin{cases} a - A x_0, & -b \le x < -x_0 \\ a + A x, & -x_0 \le x \le x_0 \\ a + A x_0, & x_0 < x \le b \end{cases}$$    (7.2)

where a is the average value of $s_p(x)$, A is the ramp slope, and
$x_0$ is the half ramp width. These three parameters are combined in
the vector

$$p = [\, a \quad A \quad x_0 \,]^T$$    (7.3)

The value of p that minimizes Eq. 7.1 is obtained by solving the set
of equations

$$\frac{\partial E_p}{\partial a} = 0$$    (7.4a)

$$\frac{\partial E_p}{\partial A} = 0$$    (7.4b)

$$\frac{\partial E_p}{\partial x_0} = 0$$    (7.4c)

Substituting in the previous equations, the optimum parameter vector
p is given by

$$a = \frac{1}{2b} \int_{-b}^{b} f(x)\,dx$$    (7.5a)

$$\int_{-x_0}^{x_0} x f(x)\,dx + x_0 \left[ \int_{x_0}^{b} f(x)\,dx - \int_{-b}^{-x_0} f(x)\,dx \right] = 2 A x_0^2 b - \tfrac{4}{3} A x_0^3$$    (7.5b)

$$\int_{x_0}^{b} f(x)\,dx - \int_{-b}^{-x_0} f(x)\,dx = 2 A x_0 (b - x_0)$$    (7.5c)

It is clear that even for this simplified case, the solution is
based on implicit functions of $x_0$ and A. Instead of solving
Eqs. 7.5b and c through an iterative procedure, it has been found
that reformulating the problem in the discrete domain will save
computation time, while giving a solution that is feasible.

In the discrete domain, the functions f(x) and $s_p(x)$ are defined
only for the set of points {-N, ..., 0, ..., N}.
In all of the following discussions, the ramped part of $s_p(x)$ is
assumed to start and end at sample points -n and n respectively.
This assumption simplifies the computation without a substantial
change in the accuracy of the results. The curve fitting procedure
reduces to finding the parameter vector

$$p = [\, a \quad A \quad n \,]^T$$    (7.6)


such that the error

$$E_p = \sum_{i=-N}^{N} \left( s_p(i) - f(i) \right)^2$$    (7.7)

is minimum. Since n assumes a finite number of integer values, the
minimization problem can be solved by repeating the computation for
each value of n and choosing the value of n that minimizes $E_p$. In
addition, by differentiating with respect to a, it can be shown that
for any value of n, the optimum a is independent of n and is given
by the average

$$a = \frac{1}{2N+1} \sum_{i=-N}^{N} f(i)$$    (7.8)


Substituting the values of $s_p(i)$ in Eq. 7.7 and arranging the
terms, $E_p$ can be expressed in the form

$$E_p = C_0 + C_1 A + C_2 A^2$$    (7.9)

where

$$C_0 = \sum_{i=-N}^{N} \left( a - f(i) \right)^2$$    (7.10a)

and

$$C_1 = 2n \sum_{i=-N}^{-(n+1)} f(i) - 2 \sum_{i=-n}^{n} i f(i) - 2n \sum_{i=n+1}^{N} f(i)$$    (7.10b)

$$C_2 = \sum_{i=-N}^{-(n+1)} n^2 + \sum_{i=-n}^{n} i^2 + \sum_{i=n+1}^{N} n^2$$    (7.10c)

Equation 7.9 can be minimized by choosing

$$A = -\frac{C_1}{2 C_2}$$    (7.11)

and for this value of A, the minimum error $E_{\min}$ is given by

$$E_{\min} = C_0 - \frac{C_1^2}{4 C_2}$$    (7.12)

One-dimensional edge fitting can be achieved by the following
procedure: given a function f(i) defined over the range [-N, N], the
average (a) is computed using Eq. 7.8. Assuming that f(i) can be
fitted to a ramp $s_p(i)$ with width n, the optimum value of A and
the corresponding minimum error $E_{\min}$ are computed using
Eqs. 7.11 and 7.12. The computation is repeated for different values
of n, and the minimum error in each case is compared. The values of
n and A that result in a global minimum error are chosen as the edge
parameters. Finally, the acceptance of the edge fitting can be
determined from the signal-to-noise ratio, $A^2/E_{\min}$. If this
ratio is larger than a threshold t, the edge fitting is accepted.
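The procedure above can be sketched directly from Eqs. 7.8 to 7.12.
This is a minimal illustration, not the original implementation: the
function names, the list layout (f[i + N] holds f(i)), the choice to
search n from 1 to N (n = 0 would make the quadratic coefficient
vanish), and the handling of a perfect fit in `accept_edge` are
assumptions.

```python
def fit_ramp_1d(f):
    """Fit the discrete ramp model to samples f(-N..N).

    f is a list of 2N+1 samples, with f[i + N] holding f(i).
    Returns (a, A, n, e_min) for the half-width n with the smallest error.
    """
    size = len(f)
    N = (size - 1) // 2
    a = sum(f) / size                                        # Eq. 7.8
    c0 = sum((a - v) ** 2 for v in f)                        # Eq. 7.10a
    best = None
    for n in range(1, N + 1):
        left = sum(f[i + N] for i in range(-N, -n))          # i = -N .. -(n+1)
        ramp = sum(i * f[i + N] for i in range(-n, n + 1))   # i = -n .. n
        right = sum(f[i + N] for i in range(n + 1, N + 1))   # i = n+1 .. N
        c1 = 2 * n * left - 2 * ramp - 2 * n * right         # Eq. 7.10b
        c2 = 2 * (N - n) * n * n + sum(i * i for i in range(-n, n + 1))  # Eq. 7.10c
        A = -c1 / (2 * c2)                                   # Eq. 7.11
        e_min = c0 - c1 * c1 / (4 * c2)                      # Eq. 7.12
        if best is None or e_min < best[3]:
            best = (a, A, n, e_min)
    return best


def accept_edge(A, e_min, t):
    """Accept the fit when A^2 / E_min exceeds the threshold t."""
    if e_min <= 0.0:
        return A != 0.0          # perfect fit: accept any nonzero slope
    return A * A / e_min > t


# Ideal ramp with a = 10, A = 3, n = 2 sampled on i = -5 .. 5.
samples = [10 + 3 * max(-2, min(2, i)) for i in range(-5, 6)]
print(fit_ramp_1d(samples))
```

For the ideal ramp in the example the search recovers the generating
parameters with zero residual error; on noisy data the ratio
A^2 / E_min passed to `accept_edge` decides whether the fit counts as
an edge.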

7.2  Two-Dimensional  Edge  Fitting 

The previous analysis can be extended to two-dimensional edge
fitting. In this case, the image function f(i,j) defined over a
subregion is compared with an ideal edge model $s_p(i,j)$, where

$$p = [\, a \quad \theta_i \quad A \quad n \,]^T$$    (7.13)

is the parameter vector. The variables a, A and n are defined as in
Section 7.1, and $\theta_i$ indicates the edge orientation. In the
following experiments, $\theta_i$ assumes one of four basic
orientations: horizontal, vertical and the two diagonals. The effect
of this approximation on the accuracy of the edge fitting will be
discussed in Section 7.3. The edge fitting is achieved by changing
the edge parameter vector p to minimize the error

$$E_p = \sum_{i=-N}^{N} \sum_{j=-N}^{N} \left( s_p(i,j) - f(i,j) \right)^2$$    (7.14)

Following the analysis in Section 7.1, it can be shown that, for the
minimum error, the parameter a is given by

$$a = \frac{1}{(2N+1)^2} \sum_{i} \sum_{j} f(i,j)$$    (7.15)

The parameters $\theta_i$ and n can be changed in finite steps, and
for each combination of $\theta_i$ and n, the error $E_p$ is in the
form

$$E_p = C_0 + C_1 A + C_2 A^2$$    (7.16)

where

$$C_0 = \sum_{i} \sum_{j} \left( a - f(i,j) \right)^2$$    (7.17)

and

$$C_1 = 2n \sum_{i=-N}^{-(n+1)} F(i) - 2 \sum_{i=-n}^{n} i F(i) - 2n \sum_{i=n+1}^{N} F(i)$$    (7.18a)

$$C_2 = (2N+1) \left[ 2(N-n) n^2 + \sum_{i=-n}^{n} i^2 \right]$$    (7.18b)


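for vertical and horizontal ramps, while

$$C_1 = 2n \sum_{i=-N}^{-(n+1)} \left[ F(i) + F(i+\tfrac{1}{2}) \right] - 2 \sum_{i=-n}^{-1} \left[ i F(i) + (i+\tfrac{1}{2}) F(i+\tfrac{1}{2}) \right] - 2 \sum_{i=1}^{n} \left[ i F(i) + (i-\tfrac{1}{2}) F(i-\tfrac{1}{2}) \right] - 2n \sum_{i=n+1}^{N} \left[ F(i) + F(i-\tfrac{1}{2}) \right]$$    (7.19a)

$$C_2 = 2(N-n) \left[ 2(N-n)+1 \right] n^2 + 2 \sum_{i=1}^{n} \left\{ \left[ 2(N-i)+1 \right] i^2 + \left[ 2(N-i)+2 \right] (i-\tfrac{1}{2})^2 \right\}$$    (7.19b)

for diagonal ramps.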

In Eqs. 7.18 and 7.19, the axis is taken perpendicular to the edge
side, and F(i) indicates the sum of all the elements at distance i
from the edge. Sketches of the masks used for vertical and diagonal
edges are shown in Figure 7.2. Since the expression of $E_p$ is the
same for both one- and two-dimensional edge fitting, the values of A
and $E_{\min}$ are still given by Eqs. 7.11 and 7.12. Thus,
two-dimensional edge fitting can be achieved by the same procedure
described in the previous section. The only changes are that the
computation has to be repeated for the different $\theta_i$, and
that the values of $C_0$, $C_1$ and $C_2$ are now given by Eqs. 7.17
to 7.19.
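For the vertical orientation, the two-dimensional procedure reduces
to the one-dimensional search applied to the column sums F(i). The
sketch below covers only that case; the function name and the
row-major list-of-lists layout are assumptions, horizontal edges can
be handled by transposing the subregion, and the diagonal case is
not implemented here.

```python
def fit_vertical_ramp_2d(image):
    """Vertical-edge fit on a (2N+1)x(2N+1) subregion.

    image[r][c] holds f at row r, column c; the axis i = c - N runs
    perpendicular to the edge. Returns (a, A, n, e_min).
    """
    size = len(image)
    N = (size - 1) // 2
    a = sum(sum(row) for row in image) / (size * size)            # Eq. 7.15
    c0 = sum((a - v) ** 2 for row in image for v in row)          # Eq. 7.17
    # F(i): sum of all elements at distance i from the edge (column sums).
    F = [sum(image[r][c] for r in range(size)) for c in range(size)]
    best = None
    for n in range(1, N + 1):
        left = sum(F[i + N] for i in range(-N, -n))               # i = -N .. -(n+1)
        ramp = sum(i * F[i + N] for i in range(-n, n + 1))        # i = -n .. n
        right = sum(F[i + N] for i in range(n + 1, N + 1))        # i = n+1 .. N
        c1 = 2 * n * left - 2 * ramp - 2 * n * right              # Eq. 7.18a
        c2 = size * (2 * (N - n) * n * n
                     + sum(i * i for i in range(-n, n + 1)))      # Eq. 7.18b
        A = -c1 / (2 * c2)                                        # Eq. 7.11
        e_min = c0 - c1 * c1 / (4 * c2)                           # Eq. 7.12
        if best is None or e_min < best[3]:
            best = (a, A, n, e_min)
    return best


# 7x7 subregion containing an ideal vertical ramp: a = 10, A = 2, n = 1.
row = [10 + 2 * max(-1, min(1, i)) for i in range(-3, 4)]
subregion = [list(row) for _ in range(7)]
print(fit_vertical_ramp_2d(subregion))
```

A full detector would repeat the fit for each of the four
orientations and keep the one with the smallest minimum error.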

The  number  of  computations  required  for  a 7x7  edge 
fitting  algorithm  is  273  additions  and  112  multiplications. 
This  can  be  compared  to  152  additions  and  1 multiplication 
needed  for  a 7x7  template  matching  operator.  The  effort 
needed  for  accessing  the  image  intensities  and  comparing 
the  masks'  outputs  is  the  same  for  both  operators.  The  CPU 
times  needed  by  a PDP-10  KL  processor  to  process  a 64x64 
image, using the 7x7 edge fitting algorithm and template
matching  operator,  are  IB  and  lb  seconds  respectively. 

7.3  Performance  Evaluation 

The performance of the edge fitting algorithm has been evaluated
using three different approaches. First, the outputs of the edge
fitting operators for edges with different orientations and
distances from the center are compared. Second, a preliminary
evaluation of the performance for noisy edges is given. Third, the
figure of merit for the edge fitting algorithm is calculated.

In the first approach, edge fitting algorithms with mask sizes 5x5,
7x7 and 9x9 are used to process image subregions containing ideal
central edges with variable orientation and ideal vertical edges
with varying distance from the mask center. Plots of
$\sqrt{E_{\min}}/A$ for the previous two cases are shown in
Figures 7.3 and 7.4 respectively.

In these curves, the abrupt jumps in $\sqrt{E_{\min}}/A$ occur when
A changes suddenly. This occurs when the width (n) of the edge model
that fits the image data is changed. From Figure 7.3, it is obvious
that the edge fitting algorithm is not isotropic: the algorithm has
the best performance for vertical edges, it is less sensitive to
edges with orientation $\pi/9 < \phi < \pi/6$, and the performance
begins to improve again as $\phi$ approaches $\pi/4$. Also, it
should be noticed that the output for $\pi/4$ is not zero. This is
because the edge model used does not include a diagonal step, which
corresponds to ramp width n = 1/2. The diagonal edges with
fractional ramp width were excluded to save computation effort, and
to keep the numbers of edge prototypes equal for both the vertical
and the diagonal edge models. The curves in Figure 7.4 show that the
error $\sqrt{E_{\min}}/A$ increases sharply as the edge is displaced
off center. This feature prevents the multiple detection of the same
edge point. The threshold of the edge fitting algorithm can be
chosen to allow the detection of central edges with a specified
minimum edge height, while suppressing displaced edges. Also, it
should be noticed that by increasing the number of discrete angles
($\theta_i$), the edge fitting performance will become more uniform.
It seems, however, that this change is not necessary, because the
performance of the edge fitting algorithm with four basic
orientations is sufficiently accurate for all practical
applications.

[Figure 7.3. Edge fitting ... as a function ... (axis: edge
orientation)]

[Figure 7.4. Edge fitting ... as a function ...]


The statistical analysis introduced in Chapter 3 can be used to
evaluate the edge fitting algorithm. Derivations of the probability
density functions of the coefficients $C_0$, $C_1$ and $C_2$, and of
the error $E_p$, are straightforward. These derivations are not
needed, however, because as a result of the large mask sizes used in
the edge fitting algorithms, the noise is usually averaged out. The
decision strategy can be derived from the deterministic analysis
given previously. To verify the validity of this assumption, the
values of $\sqrt{E_{\min}}/A$ are plotted as a function of the edge
orientation in the case of a noisy central edge. The results are
shown in Figure 7.5. The edge fitting mask is 7x7 and the
signal-to-noise ratios are 1.0, 10.0 and 100.0. It should be noticed
that for practical levels of SNR, the effect of noise is negligible.


The  edge  fitting  algorithms,  with  mask  sizes  5x5,  7x7 

and  9x9,  have  been  evaluated  using  the  figure  of  merit  of 
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— ideal  edge 


edge  or ientation,<£>,  degrees 


Figure  7.5.  Edge  fitting  normalized  error  ’^:rnin/A» 

as  a function  of  actual  edge  orientation 
for  noisy  edges 
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Chapter  5.  The  results  obtained,  for  the  vertical  and  the 
diagonal  test  images,  are  shown  in  Figure  7.C.  Examples  of 
the  edge  maps  for  SNR  = 1.0,  are  shown  in  Figure  7.7. 
Comparing  the  previous  results  with  the  results  obtained 
for  3-level  simple  operators  with  the  same  mask  sizes,  it 
can  be  noticed  that  for  small  mask  size  and  very  low  SNR, 
the  edge  fitting  algorithm  is  not  as  good  as  the  simple 
mask  operators.  This  observation  can  be  explained  by  the 
fact  that  the  edge  fitting  algorithm  bases  its  decision  on 
an  estimation  of  the  edge  parameters.  This  estimation  is 
sensitive  to  noise  especially  when  the  number  of  pixels 
used is small. However, the edge fitting algorithm has
better performance for high SNR and for large mask size.
This  is  because  the  edge  fitting  algorithm  suppresses 
displaced  edges  efficiently.  The  edge  fitting  algorithm 
has  the  additional  advantage  of  being  less  sensitive  to 
changes  in  the  signal-to-noise  ratio  of  the  image.  This 
results  from  using  a decision  strategy  that  is  based  on  the 
normalized  fitting  error. 
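
The effect of normalizing the fitting error can be illustrated
with a small sketch. The code below is not the two-dimensional
algorithm of this chapter. It assumes a one-dimensional window,
a two-level step model with the edge at the window center, and a
hypothetical helper name, normalized_step_error. The two levels
are fitted by least squares, and the square root of the residual
energy is divided by the fitted step height, so that multiplying
the window by any gain leaves the result unchanged.

```python
import math

def normalized_step_error(window):
    """Fit a centered two-level step to a 1-D window (illustrative only).

    Returns (sqrt(E) / |A|, A), where E is the least-squares residual
    energy and A is the fitted step height. The edge is assumed to lie
    between the two halves of the window.
    """
    if len(window) < 2:
        raise ValueError("window must contain at least two samples")
    half = len(window) // 2
    left, right = window[:half], window[half:]
    left_level = sum(left) / len(left)       # least-squares level, left side
    right_level = sum(right) / len(right)    # least-squares level, right side
    height = right_level - left_level
    residual = sum((x - left_level) ** 2 for x in left)
    residual += sum((x - right_level) ** 2 for x in right)
    if height == 0.0:
        return math.inf, height              # flat window: nothing to normalize by
    return math.sqrt(residual) / abs(height), height

# Hypothetical noisy step and the same step with four times the contrast.
window = [10.2, 9.8, 10.1, 9.9, 20.3, 19.7, 20.1, 19.9]
error, height = normalized_step_error(window)
scaled_error, scaled_height = normalized_step_error([4.0 * x for x in window])
```

In this sketch the two normalized errors coincide although the
step heights differ by a factor of four, which is why a fixed
threshold on the normalized error does not have to follow the
image contrast.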

7.4  Conclusion 

In this chapter, a new edge fitting algorithm has been
introduced. The new algorithm is derived in the discrete
domain, which allows a direct optimization of the operator's
performance. The performance of the new algorithm is better
than that of the edge enhancement/thresholding operators for a
wide range of signal-to-noise ratios.

Figure 7.6. Figure of merit as a function of signal-to-noise
ratio for edge fitting operator

Figure 7.7. Edge maps for the edge fitting operator, diagonal
test image with SNR=1


Chapter  8 


Conclusion  and  Further  Work 


This chapter summarizes the basic findings of the
dissertation and discusses the subjects that need further
investigation.

The objective of this work was to introduce a quantitative
analysis of edge detectors, with an emphasis on edge detectors
as local operators that can be used to preprocess the input
images without any a priori knowledge of the image contents.
The tools that have been used in this analysis are statistical
detection theory and pattern classification. These concepts
help in a better understanding of the edge detection problem.
Numerical ordering of the performance of the local edge
detectors was achieved by introducing a figure of merit
defined for specific test images. New techniques for edge
detection, including a discrete edge fitting algorithm, have
been discussed.


There are, however, more questions to be answered before a
complete understanding of the edge detection problem is
achieved. First, in the case of the differential edge
detectors, the decision is based on measurements of the
differences along two perpendicular axes. It is not clear,
however, that combining these two differences in the sum of
squares or the sum of magnitudes is the optimum decision
strategy. An optimum strategy can be developed if the
probability density function of the edge orientation,
$p(\phi)$, is known.
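
The dependence of the two combination rules on the edge
orientation can be seen in a short sketch. It assumes an
idealized model that is not taken from the dissertation: an
edge of height h whose normal makes an angle phi with the first
axis produces the differences h cos(phi) and h sin(phi) along
the two axes. The function name is hypothetical.

```python
import math

def combine_differences(height, phi_degrees):
    """Return (root of sum of squares, sum of magnitudes) for an ideal edge.

    Assumed model: the two perpendicular differences are
    height*cos(phi) and height*sin(phi).
    """
    phi = math.radians(phi_degrees)
    dx = height * math.cos(phi)
    dy = height * math.sin(phi)
    root_sum_squares = math.sqrt(dx * dx + dy * dy)
    sum_magnitudes = abs(dx) + abs(dy)
    return root_sum_squares, sum_magnitudes

for phi in (0, 15, 30, 45):
    rss, som = combine_differences(1.0, phi)
    print(f"phi={phi:2d}  root-sum-squares={rss:.3f}  sum-of-magnitudes={som:.3f}")
```

Under this model the root of the sum of squares does not
depend on the orientation, while the sum of magnitudes grows by
a factor of sqrt(2) between an axis-aligned and a diagonal
edge, so a single threshold treats the two orientations
differently.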

Second, in all the previous analysis the edges are assumed
to have specific orientations and heights. This is not true
in real world images, where edges of various orientations and
heights are present. The optimum threshold for this general
case can be derived if the statistical properties of the
image are known.

Third, there is no efficient procedure to utilize the
additional information about the edge height and orientation.
Also, the best compromise between the mask size, the number
of masks used, and the distance between consecutive
applications of the edge detector is not yet known.

The previous problems can be solved if a statistical image
model is derived. This model will help in extending the
techniques of this dissertation to the higher levels of image
understanding, such as edge linking and the recognition of
image objects.

Appendix A

Analysis  of  the  Hueckel  Algorithm 

Although the Hueckel algorithm possesses a theoretically
optimum performance, there are two basic difficulties with
the practical application of the operator. The first concerns
the effect of truncation of the orthogonal expansion, while
the second results from inaccuracies in the minimization
procedure and in the computation of edge parameters. These
two problems will be discussed in the following sections. In
Section A.1, a review of the Hueckel algorithm is given. In
Sections A.2 and A.3, the various difficulties with the
algorithm are considered.

A.1 A Review of the Hueckel Algorithm

The Hueckel algorithm starts with the image intensities
defined over a circular image subregion. A polar Fourier
expansion of the image subregion is calculated, using the
orthogonal functions given by Eqs. H.7* through H.8. The
expansion is truncated to the first nine coefficients,
$a_0, a_1, \ldots, a_8$. These coefficients are compared with
the ideal edge-line model coefficients,
$(s_0, s_1, \ldots, s_8)$. Expressions of $s_i$ are given by
Eqs. H.9 through H.10.

*This notation indicates equations in Hueckel's paper [14].

Acceptance of the edge fitting is based on three sequential
decisions, each decision taken as soon as the information
needed is available. The first decision is based on the
inequality: If

    $\sum_{i=0}^{8} a_i^2 < \ldots$    (A.1)

then classify the subregion as no-edge. Equation A.1
discards the image subregions whose input amplitude varies
less than that of a central edge of step height 1.5.


The error between the ideal and the actual signals can
be expressed in the form

    $N = a_1^2 + \tfrac{1}{2}a_2^2 + \tfrac{1}{2}a_3^2 + a_4^2 + a_5^2 + \tfrac{1}{2}a_6^2 + a_7^2 - M(C_x, C_y) + f\{[a_i], p\}$    (A.2)

where

    $M(C_x, C_y) = (e_2 C_x + e_3 C_y)^2 + e_4 C_x + e_5 C_y$    (A.3)

    $C_x = c_x^2 - c_y^2$    (A.4)

    $C_y = 2 c_x c_y$    (A.5)

The $e_j$'s are defined between Eqs. H.17 and H.19, while
$f\{\cdot,\cdot\}$ corresponds to the last five terms in
Eq. H.12. The vector $p$ is the ideal edge parameter vector
defined in Eq. 2.23.

The best edge fitting is obtained by changing the parameter
vector $p$ until $N$ becomes minimum. Hueckel argued that at
the minimum $N$, the function $f\{\cdot,\cdot\}$ vanishes.
Hence, to minimize $N$, it is sufficient to maximize
$M(C_x, C_y)$ over $C_x$ and $C_y$. The maximization of
$M(C_x, C_y)$ is achieved, approximately, by Eqs. H.20
through H.21.


The signal power, $\sum_i s_i^2$, is evaluated, using the
coefficients $[a_i]$ and the parameters $C_x$ and $C_y$.
Then, the second edge fitting decision is based on the
criterion: Classify the subregion as no-edge if

    $\sum_i s_i^2 < \ldots$    (A.6)

This inequality indicates that the noise power exceeds the
signal power.

The parameters $r_+$, $r_-$, $t_+$, $t_-$ and $b$ are
calculated by Eq. H.23 and the equations that follow it.
These equations satisfy the condition of $f\{\cdot,\cdot\}$
being zero when $N$ is minimum.

The final, and most important, decision in the edge fitting
procedure is to compare the fitting error with a variable
threshold. This is described by the criterion: If

    $\sum_i (a_i - s_i)^2 < \mathrm{Conf}\,\bigl(\sum_i s_i^2\bigr) - \mathrm{Diff}$    (A.7)

classify the subregion as an edge. The constant "Conf"
relates to the edge distinctness and "Diff" relates to the
edge pronouncedness. In evaluating Eq. A.7, different forms
of $[s_i]$ are used for the three models: general
edge-lines, edges, and lines, respectively.
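
The final acceptance test can be written as a few lines of
code. The sketch below only evaluates an inequality of the
form of Eq. A.7 for given coefficient vectors; it does not
compute the coefficients themselves. The function name and the
coefficient values are hypothetical, while Conf = 0.85 and
Diff = 100 are the settings quoted for the Hueckel operator in
the experimental appendix.

```python
def accept_edge(a, s, conf, diff):
    """Evaluate sum((a_i - s_i)^2) < conf * sum(s_i^2) - diff (sketch)."""
    if len(a) != len(s):
        raise ValueError("coefficient vectors must have the same length")
    fitting_error = sum((ai - si) ** 2 for ai, si in zip(a, s))
    signal_power = sum(si ** 2 for si in s)
    return fitting_error < conf * signal_power - diff

# Hypothetical coefficient vectors with nine entries each.
strong = [20.0] * 9   # large signal power, perfect fit
weak = [1.0] * 9      # perfect fit, but very small signal power
strong_accepted = accept_edge(strong, strong, conf=0.85, diff=100.0)
weak_accepted = accept_edge(weak, weak, conf=0.85, diff=100.0)
```

With these inputs the strong edge is accepted and the weak one
is rejected even though its fit is exact, which shows how Diff
acts as a floor on the edge strength while Conf bounds the
fitting error relative to the signal power.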

The previous discussion reviewed the basic concepts of the
Hueckel algorithm. It should be noticed that while the
algorithm is theoretically optimum, it suffers from some
deficiencies in its practical application. These
deficiencies will be explained in the following sections.

A.2 Effect of Truncation of the Orthogonal Expansion

Hueckel assumed that the use of eight, and later of nine,
coefficients of the orthogonal expansion will not affect the
edge fitting performance because real edges are blurred and
thus have small high spatial frequency components, while
these high frequency components usually result from noise.
This assumption is not true, especially if the subregion
contains a line. To determine the effect of this
approximation, the first nine Fourier coefficients of image
subregions containing ideal edges and lines are calculated
and then used to reconstruct the original signal. The
original and reconstructed signals, in the case of an ideal
central edge and ideal lines of width 1 and 3, are given in
Figure A.1. The results show the distortion introduced by
truncation, especially in the case of thin lines.
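
A one-dimensional analogue of this experiment is easy to
reproduce. The sketch below does not use Hueckel's polar basis
functions. It assumes an ordinary 17-point discrete Fourier
transform, keeps the nine lowest-frequency coefficients, and
measures the squared reconstruction error relative to the
signal energy for a step edge and for lines of width 1 and 3.
All function names are hypothetical.

```python
import cmath

def dft(signal):
    """Forward DFT with 1/n normalization."""
    n = len(signal)
    return [
        sum(signal[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n)) / n
        for k in range(n)
    ]

def truncated_reconstruction(signal, max_freq):
    """Rebuild the signal from the coefficients with |frequency| <= max_freq."""
    n = len(signal)
    coeffs = dft(signal)
    kept = [c if min(k, n - k) <= max_freq else 0j for k, c in enumerate(coeffs)]
    return [
        sum(kept[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)).real
        for j in range(n)
    ]

def relative_error(signal, max_freq):
    """Squared reconstruction error divided by the signal energy."""
    rebuilt = truncated_reconstruction(signal, max_freq)
    error = sum((x - y) ** 2 for x, y in zip(signal, rebuilt))
    return error / sum(x * x for x in signal)

n = 17
edge = [0.0] * 9 + [1.0] * 8
thin_line = [1.0 if j == 8 else 0.0 for j in range(n)]
wide_line = [1.0 if 7 <= j <= 9 else 0.0 for j in range(n)]

# max_freq = 4 keeps frequencies -4..4, i.e. nine coefficients.
errors = {
    name: relative_error(sig, 4)
    for name, sig in [("edge", edge), ("line1", thin_line), ("line3", wide_line)]
}
```

In this analogue the one-pixel line loses the largest fraction
of its energy, the three-pixel line less, and the step edge
least, which is the same ordering as reported for the
truncated polar expansion.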

The previous experiment leads to two questions. First,
what is the advantage of an optimum procedure if the models
used are far from ideal? It should be noticed that the
Hueckel algorithm suffers from difficulties in the detection
of very thin lines [27]. This can be explained by the fact
that the first nine coefficients of the Fourier expansion are
not sufficient to represent thin lines accurately. Second,
are the orthogonal functions chosen by Hueckel the best for
the truncated expansion? This point is not important if an
infinite expansion is used, as long as the orthogonal
functions form a complete set. However, if a truncated
expansion is used, it is important to choose orthogonal
functions that are more sensitive to the ideal signals of
interest. This is not the case in the Hueckel algorithm,
where the functions $H_i$ are chosen such that the
optimization procedure can be solved analytically.

Figure A.1. Ideal and reconstructed edge and line models:
a) central edge; b) line, width = 1; c) line, width = 3. The
ideal and reconstructed signals are plotted against distance.

A.3 Effect of Inaccuracy in the Minimization Procedure

The minimization procedure implemented by Hueckel suffers
from difficulties that result in a suboptimum solution.
These difficulties are summarized as follows:

First, in the minimization procedure the parameter vector
($p$) is allowed to assume complex values and also to
indicate edges with centers outside the circular subregion.
Although these two conditions do not represent acceptable
solutions, Hueckel had to allow this generalized form to
simplify the algorithm. The parameters are readjusted by
neglecting the imaginary parts, and ignoring the edges whose
centers are outside the circular subregion. It is clear that
this solution will not be the same as the optimum solution
obtained with the previous constraints taken into
consideration.

Second, the replacement of the minimization of Eq. A.2
by the maximization of Eq. A.3 is based on the assumption
that the minimum of $f\{[a_i], p\}$ is zero. This assumption
is valid only if $p$ is real [28], which is not true in
general. In fact, for the terms of $f\{[a_i], p\}$ to vanish,
the following equations should be satisfied


    $b_1(c_x, c_y) = \lambda_+ + \lambda_-$    (A.8a)

    $b_2(c_x, c_y) = \lambda_+ r_+ + \lambda_- r_-$    (A.8b)

    $b_3(c_x, c_y) = \lambda_+ r_+^2 + \lambda_- r_-^2$    (A.8c)

    $b_4(c_x, c_y) = \lambda_+ r_+^3 + \lambda_- r_-^3$    (A.8d)

where the $b$'s are functions of the set $[a_i]$ and the
parameters $c_x$ and $c_y$, while $\lambda_\pm$ are defined as

    $\lambda_\pm = t_\pm (3\pi)^{1/2} (1 - r_\pm^2)^2 / 4$    (A.9)

It is clear that the solution of Eq. A.8 will, in general,
result in complex values of $\lambda_+$, $\lambda_-$, $r_+$,
$r_-$. A real solution will be guaranteed if and only if the
image subregion corresponds to an ideal edge model.


Third, in arranging the terms in Eq. H.12, $s_8$ is
artificially set equal to $a_8$. This assumption cannot be
justified. As a result of this constraint, the accuracy of
the second Hueckel algorithm [14] is not expected to be
better than the accuracy of his first algorithm [8]. It
seems that $s_8$ was made equal to $a_8$ only to simplify the
minimization procedure.

A quantitative evaluation of the effect of the previous
approximations on the Hueckel operator performance would be
quite involved. Such an evaluation is not attempted here.
Instead, an experimental evaluation of the operator's
performance is given. In the experiment, the Hueckel
operator is applied to the vertical test image introduced in
Chapter 5. The figure of merit is plotted as a function of
signal-to-noise ratio for different choices of Hueckel's
parameters, Conf and Diff. These plots are shown in
Figure A.2. It can be noticed that the performance of the
Hueckel operator is inferior to that of the simple operators
given in Chapter 6, and it is also inferior to the edge
fitting algorithm introduced in Chapter 7.
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Appendix  B 

Orthogonal  Transformation  in  Edge  Detection 

One of the early applications of orthogonal transformation
in edge detection was given by Hueckel in his edge fitting
algorithm [8,14]. The method implements a truncated polar
Fourier expansion in the fitting procedure. Later, a
simplified version of the Hueckel algorithm was introduced by
Mero and Vassy [29]. In this procedure, only two of the
Fourier components are used in the edge detection. This
simplification results in unacceptable loss in performance
when detecting noisy edges [30].

In the Hueckel algorithm, the orthogonal transformation was
used to simplify the edge fitting procedure. A different
application of the orthogonal transformation is to use it as
a multidimensional rotation of the feature space [31]. This
approach can be useful if the edge and no-edge features are
enhanced by the transformation. The following sections
discuss this new approach. In Sections B.1 and B.2,
calculations of the Fourier components of different edge and
line models are given. In Section B.3, a preliminary analysis
of the performance of this new technique is introduced.

B.1 Edge Models in the Discrete Fourier Domain

The edge model in the spatial domain is sketched in
Figure B.1. The edges are assumed to have one of the four
basic orientations: vertical, horizontal, positive slope
diagonal, and negative slope diagonal. Central edges are
considered first, and then the analysis is extended to
non-central edges. If the edge is described by the function
$f(j,k)$, where $-N \le j,k \le N$, the corresponding Fourier
coefficients $F(u,v)$ are defined as

    $F(u,v) = \frac{1}{(2N+1)^2} \sum_{k=-N}^{N} \sum_{j=-N}^{N} f(j,k)\, w^{ju+kv}$    (B.1)

where

    $w = e^{-2\pi i/(2N+1)}, \quad i = \sqrt{-1}$    (B.2)


In many cases, the corresponding discrete Fourier
coefficients can be derived in closed form. As an example,
in the case of the central vertical edge shown in
Figure B.1b, the Fourier coefficients are of the form

    $F_v(0,0) = b + \frac{h}{2}$    (B.3)

otherwise

    $F_v(u,0) = \frac{h}{2N+1}\left[\frac{1}{2} + \frac{w^{(N+1)u} - w^{u}}{w^{u} - 1}\right]$    (B.4)

    $F_v(u,v) = 0, \quad v \neq 0$    (B.5)

Figure B.1. Edge models for the discrete Fourier transform

In the case of the central diagonal edge shown in
Figure B.1d, the Fourier coefficients are

    $F_{\pi/4}(0,0) = b + \frac{h}{2}$    (B.6)

otherwise

    $F_{\pi/4}(u,0) = -\frac{h}{2N+1}\,\frac{w^{-Nu}}{w^{u} - 1}, \quad u \neq 0$    (B.7)

    $F_{\pi/4}(0,v) = \frac{h}{2N+1}\,\frac{w^{-Nv}}{w^{v} - 1}, \quad v \neq 0$    (B.8)

    $F_{\pi/4}(u,v) = 0, \quad u, v \neq 0$    (B.9)

Similar expressions can be obtained for edges with
$\phi = \pi/2$ and $\phi = 3\pi/4$.

The previous analysis can be extended to the case of
noncentral edges and edges with general orientation. To
avoid repetition, only one of these general cases is
considered. This is the case of the displaced vertical edge
shown in Figure B.1c. The corresponding discrete Fourier
components are given by

    $F_\ell(0,0) = b + \frac{h}{2} - \frac{\ell h}{2N+1}$    (B.10)

otherwise

    $F_\ell(u,0) = \frac{h}{2N+1}\, w^{\ell u}\left[\frac{1}{2} + \frac{w^{(N+1-\ell)u} - w^{u}}{w^{u} - 1}\right]$    (B.11)

    $F_\ell(u,v) = 0, \quad v \neq 0$    (B.12)

The discrete Fourier coefficients in the case of a 5x5
central edge with different orientations are calculated.
Results are tabulated in Figure B.2. From these results,
it is clear that edge orientation can be determined from
the Fourier coefficients. A decision strategy based on
these Fourier coefficients will be given in Section B.3.
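
The coefficients of a central vertical edge can be checked
numerically. The sketch below assumes the transform of Eq. B.1
with a normalization of 1/(2N+1)^2 and w = exp(-2*pi*i/(2N+1)),
and an edge model in which the columns left of the center have
the value b, the center column b + h/2, and the columns right of
the center b + h. The closed form in closed_form_vertical is
derived under these assumptions, and all names are hypothetical.

```python
import cmath

def dft_coefficient(f, n_half, u, v):
    """F(u,v) for f(j,k) defined on -N..N in both indices."""
    m = 2 * n_half + 1
    w = cmath.exp(-2j * cmath.pi / m)
    total = 0j
    for k in range(-n_half, n_half + 1):
        for j in range(-n_half, n_half + 1):
            total += f(j, k) * w ** (j * u + k * v)
    return total / m ** 2

def central_vertical_edge(b, h):
    """Assumed model: b for j < 0, b + h/2 for j == 0, b + h for j > 0."""
    def f(j, k):
        if j < 0:
            return b
        if j == 0:
            return b + h / 2.0
        return b + h
    return f

def closed_form_vertical(h, n_half, u):
    """Closed form of F(u,0), u != 0, for the assumed edge model."""
    m = 2 * n_half + 1
    w = cmath.exp(-2j * cmath.pi / m)
    return (h / m) * (0.5 + (w ** ((n_half + 1) * u) - w ** u) / (w ** u - 1))

N, b, h = 2, 3.0, 4.0   # 5x5 window; b and h are arbitrary test values
edge = central_vertical_edge(b, h)
for u in (1, 2):
    print(u, dft_coefficient(edge, N, u, 0), closed_form_vertical(h, N, u))
```

In this sketch the direct sum and the closed form agree, the
coefficients with v != 0 vanish, and F(u,0) is purely
imaginary, so the edge information is concentrated in a few
coefficients along one axis of the Fourier plane.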


B.2  Line  Models  in  the  Discrete  Fourier  Domain 


Line detection was excluded from this dissertation for
two reasons. First, lines can be detected as two consecutive
edges, especially if the edge detector used possesses small
masks. Second, template matching line detectors suffer from
the problem of being very sensitive to the line orientation
and position, and so far, it seems there is no practical
solution to this problem. It is hoped that the sensitivity
problem can be avoided by using the discrete Fourier
transformation. This approach will be introduced in the
following paragraphs.


Figure B.3 shows discrete models for one-pixel-width lines
with vertical and diagonal orientations. The Fourier
coefficients of the vertical line and of the diagonal line
are given by Eqs. B.13 through B.19.

Figure B.3. A one-pixel-line model (panel b: diagonal line)

In the case of a vertical line at a distance $\ell$ from
the origin in the spatial plane, the discrete Fourier
components become

    $F_\ell(0,0) = b + \frac{h}{2N+1}$    (B.20)

otherwise

    $F_\ell(u,0) = \frac{h\, w^{\ell u}}{2N+1}$    (B.21)

    $F_\ell(u,v) = 0, \quad v \neq 0$    (B.22)

It should be noticed that the only difference between the
Fourier components of a central and a displaced vertical
line is a phase factor in $F_\ell(u,0)$. The changes are more
pronounced in the case of shifted diagonal lines.
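
The phase-factor property can be verified with a direct
computation. The sketch below assumes the same transform as in
Eq. B.1, with a normalization of 1/(2N+1)^2 and
w = exp(-2*pi*i/(2N+1)), and a line of height h over a
background b occupying a single column. The function name and
the test values are hypothetical.

```python
import cmath

def line_dft(n_half, b, h, offset, u, v):
    """F(u,v) for a one-pixel vertical line located at column `offset`."""
    m = 2 * n_half + 1
    w = cmath.exp(-2j * cmath.pi / m)
    total = 0j
    for k in range(-n_half, n_half + 1):
        for j in range(-n_half, n_half + 1):
            value = b + h if j == offset else b
            total += value * w ** (j * u + k * v)
    return total / m ** 2

N, b, h = 2, 1.0, 5.0                     # 5x5 window, arbitrary b and h
w = cmath.exp(-2j * cmath.pi / (2 * N + 1))
central = line_dft(N, b, h, 0, 1, 0)      # line through the origin
shifted = line_dft(N, b, h, 1, 1, 0)      # same line displaced by one pixel
```

Here central equals h/(2N+1) and shifted equals the same value
multiplied by w, so a detector that thresholds the magnitude
of F(1,0) responds identically to the two lines.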


The Fourier coefficients of a 5x5 central line with
different orientations are calculated and results are
tabulated in Figure B.4. Again, it is clear that the line
orientation can be determined from the Fourier coefficients.

Figure B.5. Detection of a rotated vertical line

The value of $F_r(1,0)$ is

    $F_r(1,0) = 0.1309h$    (B.25)

This represents a ratio of 0.65 of the value $F_v(1,0)$. On
the other hand, if the template matching operator, shown in
Figure B.5b, is used, the output in the case of the rotated
line will be

    $X_r = 7.5h$    (B.26)

This represents a ratio of 0.375 of the value $X_v$.

B.3  Performance  Analysis  of  the  Discrete  Fourier  Transform 
Edge  Detector 

The performance of the previous edge and line detectors
can be evaluated using the statistical model of Chapter 3.
In this model, the spatial function $f(j,k)$ is the sum of a
signal and a noise component

    $f(j,k) = \bar{f}(j,k) + n(j,k)$    (B.27)

where $n(j,k)$ is an additive white Gaussian noise with zero
mean and standard deviation $\sigma$. The corresponding
discrete Fourier coefficients $F(u,v)$ will be, in general,
complex random variables. The real and imaginary components
of $F(u,v)$ can be arranged in the vector form

    $\mathbf{F} = [\,F_R(-N,N),\ F_I(-N,N),\ \ldots,\ F_R(N,-N),\ F_I(N,-N)\,]^T$    (B.28)

where $\mathbf{F}$ is a joint Gaussian vector with mean

    $\bar{\mathbf{F}} = \frac{1}{(2N+1)^2}\left[\ \sum_j \sum_k \bar{f}(j,k)\cos\frac{2\pi(-Nj+Nk)}{2N+1},\ \ldots,\ -\sum_j \sum_k \bar{f}(j,k)\sin\frac{2\pi(Nj-Nk)}{2N+1}\ \right]^T$    (B.29)

and the covariance matrix is

    $\frac{\sigma^2}{2(2N+1)^2}\, I$    (B.30)

In Eqs. B.28 to B.30, the term corresponding to $F(0,0)$ is
excluded. Therefore, the identity matrix $I$ is of size
$2[(2N+1)^2 - 1]$.
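
The variance on the diagonal of this covariance matrix can be
checked without any simulation. For white noise of variance
sigma^2, the variance of a weighted sum of the samples is
sigma^2 times the sum of the squared weights, so it suffices to
sum the squared cosine and sine weights of one coefficient. The
sketch below assumes the 1/(2N+1)^2 normalization and examines
a single coefficient only; the function name is hypothetical.

```python
import math

def coefficient_noise_moments(n_half, u, v, sigma=1.0):
    """Variances of the real and imaginary parts of F(u,v), and their
    covariance, for additive white noise of standard deviation sigma."""
    m = 2 * n_half + 1
    var_re = var_im = cov = 0.0
    for j in range(-n_half, n_half + 1):
        for k in range(-n_half, n_half + 1):
            theta = 2.0 * math.pi * (u * j + v * k) / m
            c = math.cos(theta) / m ** 2     # weight of n(j,k) in the real part
            s = -math.sin(theta) / m ** 2    # weight of n(j,k) in the imaginary part
            var_re += sigma ** 2 * c * c
            var_im += sigma ** 2 * s * s
            cov += sigma ** 2 * c * s
    return var_re, var_im, cov

var_re, var_im, cov = coefficient_noise_moments(2, 1, 0, sigma=3.0)
expected = 3.0 ** 2 / (2 * (2 * 2 + 1) ** 2)   # sigma^2 / (2 (2N+1)^2)
```

For a 5x5 window both variances equal sigma^2/50 and the
covariance between the real and imaginary parts is zero, in
agreement with the scaled identity form of Eq. B.30 for the two
components of one coefficient.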


The fact that the different components of $\mathbf{F}$ are
independent Gaussian simplifies the performance evaluation.
As an example, if the decision strategy is to detect a
vertical edge when

    $|F_x(1,0)| > t_x$    (B.31)

the probability of correct detection of a vertical edge is

    P(vertical edge | vertical edge)

Better performance can be achieved, however, if edge
detection is based on simultaneous comparison of the Fourier
coefficients. Thus, edge detection becomes multiple
hypotheses testing in a vector space. This approach needs
further investigation.
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Appendix  C 

Derivations  of  Eqs.  3.29,  3.31  and  3.32 

In deriving these equations, it should be noticed that
the equation

    $A = |X| + |Y|$    (C.1)

corresponds to lines 1, 2, 3 and 4 in Figure C.1. Thus the
probability density function $p(A)$ is given by

    $p(A) = \int_{Y=0}^{A} p_X(A-Y)\,p_Y(Y)\,dY + \int_{Y=0}^{A} p_X(Y-A)\,p_Y(Y)\,dY + \int_{Y=-A}^{0} p_X(-A-Y)\,p_Y(Y)\,dY + \int_{Y=-A}^{0} p_X(A+Y)\,p_Y(Y)\,dY$    (C.2)

and

    $P(A<t) = \int_{Y=-t}^{0} \int_{X=-(t+Y)}^{t+Y} p_X(X)\,p_Y(Y)\,dX\,dY + \int_{Y=0}^{t} \int_{X=-(t-Y)}^{t-Y} p_X(X)\,p_Y(Y)\,dX\,dY$    (C.3)
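
The four-segment form of the density can be checked
numerically in a special case. The sketch below assumes that X
and Y are independent zero-mean Gaussian variables with unit
standard deviation. This is an assumption made for the check,
not a statement from the derivation. In that case the region
|X| + |Y| < t is a square rotated by 45 degrees, so that
P(A < t) = erf(t/2)^2. The code integrates the four segment
integrals with a midpoint rule; all names are hypothetical.

```python
import math

def normal_pdf(x, sigma=1.0):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def midpoint(func, lo, hi, steps=200):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

def pdf_sum_of_magnitudes(a, p_x=normal_pdf, p_y=normal_pdf):
    """Density of A = |X| + |Y| as a sum over the four line segments."""
    if a <= 0.0:
        return 0.0
    first = midpoint(lambda y: p_x(a - y) * p_y(y), 0.0, a)     # X > 0, Y > 0
    second = midpoint(lambda y: p_x(y - a) * p_y(y), 0.0, a)    # X < 0, Y > 0
    third = midpoint(lambda y: p_x(-a - y) * p_y(y), -a, 0.0)   # X < 0, Y < 0
    fourth = midpoint(lambda y: p_x(a + y) * p_y(y), -a, 0.0)   # X > 0, Y < 0
    return first + second + third + fourth

def prob_below(t):
    """P(A < t), obtained by integrating the density from 0 to t."""
    return midpoint(pdf_sum_of_magnitudes, 0.0, t)

numeric = prob_below(2.0)
reference = math.erf(1.0) ** 2    # erf(t/2)^2 for t = 2
```

The numerical value and the closed-form reference agree to
within the accuracy of the midpoint rule, which supports the
decomposition of the density into the four segments.
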
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Appendix  D 

The  Herskovits  Algorithm 


Concepts of statistical detection theory were first
utilized in the design of edge detectors by Griffith [9],
Yakimovsky [10], and Herskovits [26]. A brief discussion of
the Griffith and Yakimovsky techniques was given in
Chapter 1. In this appendix, a discussion of the Herskovits
approach and its resemblance to the statistical analysis of
Chapter 3 is given.

Herskovits was interested in processing images that
contain polyhedra. The edges of a polyhedron can be in the
form of ideal or defocussed steps and roofs. These intensity
models should be distinguished from the unwanted signals that
take the form of constant slow slopes and Gaussian noise.

and $\delta$ is a fixed interval. A two-sided cutoff ($a$)
is put on $D(x)$ so that if $|D(x)| < a$, then $D(x)$ is set
to 0. Next, the function $F_s(x)$ is computed as

    $F_s(x) = \sum_{i=1}^{\delta} \mathrm{sg}(D(x+i)) - \sum_{i=1}^{\delta} \mathrm{sg}(D(x-i))$    (D.2)

where

    $\mathrm{sg}(x) = \begin{cases} 1 & x > 0 \\ 0 & x = 0 \\ -1 & x < 0 \end{cases}$    (D.3)

Actually, $F_s(x)$ is computed over a two-dimensional
neighborhood. Finally, local maxima of $F_s(x)$ are found,
and a line fitting procedure builds the complete edge [32].
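
The cutoff and the sign-sum of Eq. D.2 can be sketched in one
dimension. The code below takes the sequence D(x) as given,
because its definition is not reproduced here, and it treats
delta as an integer number of samples. Both choices, as well as
the function names and the sample values, are assumptions made
for illustration.

```python
def sign(value):
    """sg(x): 1 for x > 0, 0 for x == 0, -1 for x < 0."""
    return (value > 0) - (value < 0)

def apply_cutoff(d_values, cutoff):
    """Two-sided cutoff: set D(x) to 0 wherever |D(x)| < cutoff."""
    return [0.0 if abs(d) < cutoff else d for d in d_values]

def sign_sum_response(d_values, x, delta):
    """Sum of sg(D) over delta samples ahead of x minus the sum over
    delta samples behind x."""
    if x - delta < 0 or x + delta >= len(d_values):
        raise ValueError("neighborhood extends past the ends of the sequence")
    ahead = sum(sign(d_values[x + i]) for i in range(1, delta + 1))
    behind = sum(sign(d_values[x - i]) for i in range(1, delta + 1))
    return ahead - behind

# Hypothetical D(x) samples that change sign around index 3.
raw = [-2.0, -1.5, -0.2, 0.1, 0.3, 1.8, 2.2]
clipped = apply_cutoff(raw, cutoff=0.5)
response = sign_sum_response(clipped, x=3, delta=3)
```

Only the signs of the surviving samples enter the response, so
small fluctuations removed by the cutoff do not contribute,
while a consistent change of sign across x produces a large
value.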

The edge detector parameters were chosen to maximize the
probability of correct detection for a given probability of
false detection. This approach resembles the statistical
analysis introduced in Chapter 3. The basic differences
between the Herskovits technique and the analysis of this
dissertation can be summarized as follows.

First, Herskovits was interested in a limited domain of
images. Thus, the classes of edges and no-edges were
determined by a priori knowledge of the image contents and
the imaging process. This kind of knowledge was not used in
the present dissertation.

Second, in the analysis given by Herskovits, edges were
assumed to be vertical. To detect other edge orientations,
the operators should be rotated. This assumption simplified
the derivation of a statistical model, but limited its
application. The analysis given in Chapters 2 and 3 of this
dissertation is based on a general edge model that has been
used in evaluating the performance of different edge
detectors.

Third, Herskovits was attempting to achieve an almost
error free detection because the systems used to recognize
polyhedra are very sensitive to errors introduced in the low
levels of image processing. It seems that a better strategy
for image understanding systems should allow for a larger
probability of error at the low levels, which can be improved
later by feedback from the high levels of image processing.

Appendix E

Experimental  Results 

The models used in edge detector evaluation assume that
images consist of ideal steps or ramps affected by additive
white Gaussian noise. In real world pictures, however, noise
is often considered to be the irrelevant image intensities
such as the background. It is important to determine edge
detector performance for both artificial and actual image
models.

A simple procedure to achieve this comparison is to test
the different edge detectors using real world pictures.
Examples of this experiment are shown in Figures E.1, E.2 and
E.3. In these examples, the 3x3 Prewitt operator, the 3x3
and 7x7 3-level operators, the 7x7 edge fitting algorithm and
the Hueckel operator are applied to test pictures containing
a girl, an airport and a tank. The thresholds for the
Hueckel and the edge fitting algorithms are fixed at optimum
values: Conf = 0.85, Diff = 100 for Hueckel and t = 0.045 for
the edge fitting algorithm. The thresholds for the Prewitt
and the 3-level operators are chosen so that the number of
edges detected equals the number of edges detected by the
Hueckel algorithm.

Figure E.1. (Continued)

Figure E.2. Examples of edge maps, airport picture:
a) original; b) 3x3 mask, Prewitt operator; c) 3x3 mask,
3-level operator; d) 7x7 mask, 3-level operator; e) 7x7 mask,
edge fitting operator; f) 69 pixels, Hueckel operator

Figure E.3. Examples of edge maps, tank picture:
c) 3x3 mask, 3-level operator; d) 7x7 mask, 3-level operator


In comparing the performance of the edge
enhancement/thresholding operators with that of the edge
fitting algorithms, it is seen that the edge fitting
algorithms are better able to outline the "usually" relevant
scene content. This results from the more general edge
models used in the edge fitting algorithms, which allow for
detection of out-of-focus objects. Also, it should be
observed that while the edge fitting algorithms use fixed
thresholds, the thresholds of the edge
enhancement/thresholding operators have to be varied for
different images.

For the edge enhancement/thresholding operators, the 3x3
Prewitt and 3-level operators have practically the same
performance. Also, the effects of increasing the mask size,
namely, suppression of noise and lowering of the operator
resolution, are apparent in the tank pictures.

The new edge fitting algorithm has better performance than
that of the Hueckel operator because the new algorithm
preserves more of the relevant structure of the pictures.

These observations have been predicted previously in the
dissertation. This shows that there is a correlation between
the artificial and actual image models. Further
investigation of this assumption, based on quantitative
measurements, is still needed.